The Finditbyme search engine was a two-year, 8,000-hour project that broke new ground for me technically.
In 2007, I began my almost decade-long plan to build a local search engine.
The technology stack:
- MS SQL Server 2008
  - 300 stored procedures
  - 80 table-valued functions
  - 100 scalar-valued functions
  - 5 table triggers
- VB.NET 2008
- IIS web server
- TatukGIS mapping library
- Gizmox VisualWebGUI (which allows a web-forms-based application to actually work on the open Internet)
I downloaded and processed gigabytes of aerial and satellite photography from all parts of California, along with vector data covering roads, airports, places of interest and landmarks.
A volunteer and I photographed the storefronts of over 12,000 businesses and other locations throughout California with a GPS-enabled camera. I processed each image and used the coordinates recorded with it to position each of those 12,000 locations exactly along a road segment. This avoided the location interpolation used in most mapping products, which results in wild errors in GPS-enabled vehicles and other navigational devices.
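As a rough illustration of placing a photographed location on a road segment, the sketch below projects a GPS point onto the nearest point of a straight segment. It is a minimal sketch, not the original VB.NET/TatukGIS code: the function name `snap_to_segment` and the sample coordinates are hypothetical, and it assumes segments short enough that a flat local approximation is acceptable.

```python
import math


def snap_to_segment(p, a, b):
    """Return the point on segment a-b closest to p, and its fraction along a-b.

    All points are (lat, lon) tuples in degrees. A local equirectangular
    approximation is used, which is only reasonable for short segments.
    """
    lat0 = math.radians(p[0])

    def to_xy(pt):
        # Scale longitude by cos(latitude) so x and y are in comparable units.
        return (math.radians(pt[1]) * math.cos(lat0), math.radians(pt[0]))

    px, py = to_xy(p)
    ax, ay = to_xy(a)
    bx, by = to_xy(b)
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy

    if seg_len_sq == 0.0:
        t = 0.0  # degenerate segment: both endpoints coincide
    else:
        # Scalar projection of (p - a) onto (b - a), clamped to the segment.
        t = ((px - ax) * dx + (py - ay) * dy) / seg_len_sq
        t = max(0.0, min(1.0, t))

    snapped = (a[0] + t * (b[0] - a[0]), a[1] + t * (b[1] - a[1]))
    return snapped, t


# Hypothetical storefront photo taken slightly north of an east-west road.
photo = (34.1, -117.5)
road_start, road_end = (34.0, -118.0), (34.0, -117.0)
print(snap_to_segment(photo, road_start, road_end))
```

The fraction along the segment is returned as well, since it is the value an interpolating geocoder would otherwise have to estimate from address ranges.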
I next started on the actual search technology. I wanted the system to be usable worldwide, so I made it multilingual.
I wanted to support two search behaviors: broad, category-based searching and more specific keyword-based searches.
To achieve that, I built digraph (directed graph) structures in SQL Server organized as dual ontologies: one for search patterns (broad to specific), and one for language interpretation for the purposes of search. Using the ontology model, I constructed a database of lexemes, expressed in a tokenised language that represented various objects and the relationships between them.
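To make the broad-to-specific idea concrete, here is a minimal in-memory sketch of a directed graph of search tokens, where a broad category expands to every narrower token reachable from it. The original lived in SQL Server tables and stored procedures; the token names, the `EDGES` structure, and the `narrower_tokens` function below are all hypothetical stand-ins.

```python
from collections import deque

# Hypothetical ontology edges: each token points to its narrower tokens.
EDGES = {
    "FOOD": ["RESTAURANT", "GROCERY"],
    "RESTAURANT": ["PIZZERIA", "SUSHI_BAR"],
    "GROCERY": [],
    "PIZZERIA": [],
    "SUSHI_BAR": [],
}


def narrower_tokens(graph, start):
    """Breadth-first walk from a token to all narrower tokens, broad first."""
    seen = {start}
    order = [start]
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for child in graph.get(node, []):
            if child not in seen:  # guard against revisiting shared nodes
                seen.add(child)
                order.append(child)
                queue.append(child)
    return order


# A broad category search expands to its whole subtree;
# a specific token expands only to itself.
print(narrower_tokens(EDGES, "FOOD"))
print(narrower_tokens(EDGES, "PIZZERIA"))
```

In a relational database the same expansion would typically be a recursive query over an edge table; the traversal logic is the same.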
The tokens were translated into 31 different world languages as a proof of concept. Each ontological variation in the search relationships then had 31 search patterns, one encoded in each of the languages.
The search engine lets you mix languages, and even grammars, within a single search, and the results are amazing. I conceived of a new way of representing lexical and knowledge relationships, which I will incorporate in a new version using a graph database such as Neo4j.
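One way to picture mixed-language searching is to map every surface word, whatever its language, onto a shared language-neutral token before matching. The sketch below assumes exactly that; the `LEXEMES` table, its handful of English, Spanish, Italian and French entries, and the `to_tokens` function are hypothetical and far simpler than the lexeme database described above.

```python
# Hypothetical lexeme table: surface word (any language) -> neutral token.
LEXEMES = {
    "restaurant": "RESTAURANT",
    "restaurante": "RESTAURANT",
    "ristorante": "RESTAURANT",
    "near": "NEAR",
    "cerca": "NEAR",
    "près": "NEAR",
    "beach": "BEACH",
    "playa": "BEACH",
    "plage": "BEACH",
}


def to_tokens(query):
    """Translate a free-text query into neutral tokens, skipping unknown words."""
    tokens = []
    for word in query.casefold().split():
        token = LEXEMES.get(word)
        if token is not None:
            tokens.append(token)
    return tokens


# Two queries in different language mixes reduce to the same token sequence.
print(to_tokens("Restaurante near plage"))
print(to_tokens("ristorante cerca de la playa"))
```

Because matching happens on tokens rather than words, the language of each individual word stops mattering once the lookup has been done.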
I return all results in XML to make the output consumable by multiple clients, including web services. I designed a matrix-based page layout and a XAML-based layout design system, which allowed me to display control elements anywhere on the page.
In addition to the flexible design surface, I use the multilingual ontology system to render web page controls in any target language.
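For the XML output mentioned earlier, a minimal sketch of serializing search results might look like the following. The element and attribute names (`searchResults`, `result`, `name`, `lat`, `lon`) and the sample record are assumptions made for illustration, not the engine's actual schema.

```python
import xml.etree.ElementTree as ET


def results_to_xml(query, results):
    """Serialize a list of result dicts into an XML string."""
    root = ET.Element("searchResults", {"query": query})
    for item in results:
        node = ET.SubElement(
            root,
            "result",
            {"lat": str(item["lat"]), "lon": str(item["lon"])},
        )
        ET.SubElement(node, "name").text = item["name"]
    return ET.tostring(root, encoding="unicode")


# Hypothetical result record.
sample = [{"name": "Example Pizzeria", "lat": 34.0, "lon": -117.5}]
print(results_to_xml("pizza", sample))
```

Any client that can parse XML, whether a web page, a web service, or a desktop tool, can consume the same payload without the server knowing which one is asking.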