Tiny Search Engine

Project Summary

For this project, we were recreating crawling, indexing, and querying as present in most modern search engines. The three modules, crawler, indexer, and querier work collectively to scrape word frequency data from a starting webpage down to a defined depth. The data is stored and then may be queried with responsive searches which allow for useful operators like AND and OR. It ranks the pages according to the relevance to the search.

The search engine is written in C. The crawler scrapes html and stores unique and unseen pages as files in the chosen file directory. It includes a unique ID number as the file name, and the file contents include the URL, the depth, and the page contents. The indexer

The puzzle solution and generation was written in C. To extend the project beyond the basic scope, one of my team members also wrote a front and backend for a web-based GUI, shown to the left, which was written in Node and React.