Search in 2008: The Listings
As you know from my prior post, gathering, processing and assembling the GIS data took almost four months. During the long downloads, I had time to think about what was to come next. I decided to start collecting business listing data, which I sourced from a few different places. I’m not going to discuss where I got it, because I had consulted an attorney who warned me in explicit terms that sourcing the data from anyone could raise copyright problems.
I came up with a strategy that would help get the listings started but not expose me to any copyright problems. You may be aware that many companies that publish physical and online maps put in phony streets and other features that don’t exist in the real world. If they look up one of those features in your system and find it, they know you did not collect the data yourself; you copied theirs. So what to do?
I purchased a list I could use for marketing purposes. Since I intended to physically check each business anyway, the bulk list simply acted as a guide for me to make sure that I covered my bases. As I worked through the records one by one, I discovered there was at least a 30% error rate in terms of business name, location and whether the business even existed. This database also had latitude and longitude, but the coordinates were usually off by several hundred feet.
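To make "off by several hundred feet" concrete, here is a minimal sketch of how such an offset can be measured once you have both the purchased coordinate and a verified one. It uses the standard haversine formula and is not the code from the project; the function name and the sample coordinates are made up for illustration.

```python
import math

# Mean Earth radius (6,371 km) converted to feet.
EARTH_RADIUS_FEET = 6_371_000 / 0.3048


def offset_feet(lat1, lon1, lat2, lon2):
    """Great-circle distance in feet between two lat/lon points (haversine)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_FEET * math.asin(math.sqrt(a))


# Hypothetical example: the purchased list's coordinate for a business
# versus the coordinate read off the map at the actual storefront.
purchased = (33.4936, -117.1484)
verified = (33.4945, -117.1484)

print(f"Listing is off by about {offset_feet(*purchased, *verified):.0f} feet")
```

At these distances a simpler flat-earth approximation would do just as well; haversine is used here only because it is the familiar formula and works at any range.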
The daily routine was to go out with a Ricoh GPS camera and take a picture of the storefront, making sure to get the number and suite if there was one. This took less than 10 seconds when you got the rhythm down. After getting a good batch, the pictures were downloaded onto a workstation and processed into the database with a program I wrote. The images were assigned to a listing, and a pointer appeared on the georectified images I had just spent months compiling. By positioning the pointer on the map, I could accurately get the latitude and longitude of the storefront.
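One small step in a pipeline like this is turning the position a GPS camera records into the decimal latitude and longitude a database wants. The sketch below assumes the camera writes the standard EXIF GPS tags, which store each coordinate as degrees, minutes and seconds plus a hemisphere letter. It is an illustration only, not the program described above, and the function name and sample values are hypothetical.

```python
def dms_to_decimal(degrees, minutes, seconds, ref):
    """Convert an EXIF-style degrees/minutes/seconds value to decimal degrees.

    `ref` is the hemisphere letter from the EXIF reference tag:
    "N" or "S" for latitude, "E" or "W" for longitude.
    South and west are negative in decimal notation.
    """
    value = degrees + minutes / 60 + seconds / 3600
    return -value if ref in ("S", "W") else value


# Hypothetical values as they might be read from one photo's EXIF data.
lat = dms_to_decimal(33, 29, 37.0, "N")
lon = dms_to_decimal(117, 8, 54.0, "W")

print(f"{lat:.6f}, {lon:.6f}")
```

The camera's fix only gives a starting point near where the photographer stood, which is why positioning the pointer on the map by hand was still the step that pinned down the storefront itself.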
I had a volunteer help me at this stage, but I did pay for his gas. He was unemployed but his wife worked and he was interested in helping the project. He ended up shooting all the businesses in these California areas: Bonsall, Fallbrook, Big Bear and almost all of Temecula. I did Red Mountain and Trona in California, just to make sure far-flung locations would work. After only two months we had close to 14,000 listings in the database, hand-verified.
What we discovered was that our verified list did not correspond to anything anyone else had. Local chambers of commerce had only a fraction of the businesses as members. Other local merchant associations actually had no idea who did business in their town. I found this rather puzzling, but I felt the project was on the right track.
It also occurred to me that I had shot a lot of empty buildings, along with their “For Lease” signs. I started to wonder if I could also help people who were looking for a business location search for available suites or unoccupied buildings. I continued to track these locations and update them in the database.
By now, the project had made it to May of 2008. I had deepened my skills at Transact-SQL and developed my own stored procedure and function templates. I spent a lot of time refactoring my code so it was consistent and easy to read. Someone else would have to maintain this later, and I wanted it to be clear and easy to follow. Getting to work on code around the clock was without equal as a creative exploit.