Posts

Showing posts from December, 2016

Virtual Private Servers (VPS) and Cloud hosting are now viable

For many, many years, I was a massive fan of dedicated web hosting. I was VERY vocal about how you couldn't run a legitimate, professional business without using dedicated web hosting. And time and time again, I was proven right as people who had bet their business on shared web hosting came out of the woodwork in various places, having lost - and sometimes they lost EVERYTHING, including their business and all their customers! Shared web hosting is still the bottom-of-the-barrel, scummy/scammy money grab that it has always been, and no respectable business should be caught dead running their web infrastructure on it. Period. That hasn't changed. However, I have been watching a couple of new stars grow from infancy and come into their own over the past 8 years: Virtual Private Servers, aka VPS, and their newer, shinier cousin, Cloud Hosting. Dedicated web hosting is expensive. It always has been, because you get a piece of hardware, a network drop, electricity, a tran

You can still buy a brand new Dot Matrix printer...

Today, I learned that people still buy brand new dot matrix printers. You know, those extremely noisy printers I thought we ditched as soon as it was possible to do so. Well, except for the nutcases who turn them into "musical instruments" and start a YouTube channel. But, no, sales of brand new(!) dot matrix printers are apparently still, relatively speaking, alive and well: Dot matrix printers on Newegg. After doing some research, it turns out that, for bulk printing where output quality and "professional" appearance don't matter at all, dot matrix printers can be anywhere from 4 to 8 times cheaper per printed page than laser printers (the next cheapest technology) when maintenance costs are amortized over the lifetime of each type of printer. With dot matrix, you're not going to get the speed, accuracy, or quietness of laser, but you'll supposedly save a boatload of money on toner. Maybe one day we will get a printer that combin

Bulk web scraping a website and then reselling the content can land you in legal hot water

This interesting article on web scraping just came to my attention: New York Times: Auction Houses Face Off in Website Data Scraping Lawsuit. Summary: An auction house in New York is suing an auction house in Dallas for copyright violations, alleging that the Dallas house scraped the New York auction house's website listings, including the listing photos, and then SOLD those listings and photos in an aggregate database for profit. As I'm the author of one of the most powerful and flexible web scraping toolkits (the Ultimate Web Scraping Toolkit), I have to reiterate the messaging found on the main documentation page: Don't use the toolkit for illegal purposes! If you are going to bulk scrape someone's website, you need to make sure you are legally in the clear to do so and that you respect reasonable rate limits and the like. Reselling the data acquired with a scraping toolkit seems like an extremely questionable thing to do from a legal perspective. The problem wi

Setting up your own Root Certificate Authority - the right way!

Setting up your own Root Certificate Authority, aka Root CA, can be a difficult process. Web browsers and e-mail clients won't recognize your CA out-of-the-box, so most people opt to use public CA infrastructure. When security matters, using a public CA is the wrong solution. Privately owned and controlled CAs can be infinitely more secure than their public counterparts. However, most people who set up a private CA don't set up their CA infrastructure correctly. Here is what most private CAs look like: "Root CA cert -> Server cert". This is wrong because the server certificate has to be regenerated regularly (e.g. annually). If the root certificate is compromised, then it involves fairly significant effort to replace all of the certificates, including the root. What should be built is this: "Root CA cert -> Intermediate cert -> Server cert". In fact, this is the format that most public CAs use. The root CA cert is generated on a machine that isn't conne
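
To make the "Root CA cert -> Intermediate cert -> Server cert" layout a little more concrete, here is a minimal Python sketch of how a client walks such a chain. It is a toy model only: there is no real cryptography or signature checking, and the `Cert` class, the `build_chain` function, and all of the certificate names are hypothetical names made up for illustration. The point it tries to show is that clients only need to trust the root, so an intermediate can be swapped out without touching the trusted root list.

```python
from dataclasses import dataclass
from typing import Dict, List, Set


@dataclass(frozen=True)
class Cert:
    """Toy certificate: names only, no keys or signatures (illustration only)."""
    subject: str
    issuer: str
    is_ca: bool


def build_chain(leaf: Cert, known: Dict[str, Cert], trusted_roots: Set[str]) -> List[str]:
    """Follow issuer links from the leaf up to a trusted, self-issued root."""
    chain = [leaf.subject]
    seen = {leaf.subject}
    current = leaf
    while current.subject != current.issuer:
        parent = known.get(current.issuer)
        if parent is None:
            raise ValueError(f"unknown issuer: {current.issuer}")
        if not parent.is_ca:
            raise ValueError(f"issuer is not a CA: {parent.subject}")
        if parent.subject in seen:
            raise ValueError("loop in certificate chain")
        seen.add(parent.subject)
        chain.append(parent.subject)
        current = parent
    if current.subject not in trusted_roots:
        raise ValueError(f"chain ends at untrusted root: {current.subject}")
    return chain


# Hypothetical three-level hierarchy.
root = Cert("Example Root CA", "Example Root CA", is_ca=True)
intermediate = Cert("Example Intermediate CA", "Example Root CA", is_ca=True)
server = Cert("www.example.internal", "Example Intermediate CA", is_ca=False)

trusted_roots = {root.subject}  # the only thing clients have to install
known = {c.subject: c for c in (root, intermediate)}
chain = build_chain(server, known, trusted_roots)

# Swap in a new intermediate: the client's trusted root list is unchanged.
intermediate2 = Cert("Example Intermediate CA 2", "Example Root CA", is_ca=True)
server2 = Cert("www.example.internal", "Example Intermediate CA 2", is_ca=False)
known2 = {c.subject: c for c in (root, intermediate2)}
chain2 = build_chain(server2, known2, trusted_roots)
```

In this model, a server certificate issued by the retired intermediate no longer resolves (`build_chain(server, known2, trusted_roots)` raises `ValueError`), while certificates issued by the replacement intermediate validate against the same root.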