A web search engine for the UCI domain. Migrated from GitHub.
templates Everything done and ready to test 2022-05-27 23:00:45 -07:00
test Implemented all necessary indexer informations 2022-05-27 06:29:48 -07:00
__init__.py First pushed, setup all the stuff we need, no launcher yet. So test your code in another place for now, because they are all codepended on each others ... 2022-05-04 12:22:20 -07:00
.gitignore Added functionality of creating the index through the html 2022-05-27 17:39:34 -07:00
docs.weight Basic web-gui 2022-05-27 17:01:35 -07:00
importanttext.py added important tokens 2022-05-06 17:19:37 -07:00
indexer.py Everything done and ready to test 2022-05-27 23:00:45 -07:00
launcher.py Everything done and ready to test 2022-05-27 23:00:45 -07:00
merged_index.full Implemented all necessary indexer informations 2022-05-27 06:29:48 -07:00
merged_index.index Implemented all necessary indexer informations 2022-05-27 06:29:48 -07:00
mytest.py tf-idf ngrams and now returns dict rather than 2022-05-11 14:46:32 -07:00
posting.py Fully changed indexer and worker classes with properly indexing 2022-05-27 05:11:01 -07:00
README.md Everything done and ready to test 2022-05-27 23:00:45 -07:00
requirements.txt Added functionality of creating the index through the html 2022-05-27 17:39:34 -07:00
search.py Everything done and ready to test 2022-05-27 23:00:45 -07:00
test_merge.py Fully changed indexer and worker classes with properly indexing 2022-05-27 05:11:01 -07:00
test.py Implemented all necessary indexer informations 2022-05-27 06:29:48 -07:00
testfile.json added important tokens 2022-05-06 17:18:34 -07:00
worker.py We are looking for TF_WEIGHT not IDF_WEIGHT, make things A LOT CHEAPER 2022-05-27 10:39:13 -07:00

Search_Engine

A mini search engine in Python, built on an inverted index with stemming and other search-engine techniques.

Start the program by running python3 launcher.py. A Flask web page will start. If you do not have any index files, the page will show an error.

There is a button at the top of the page called Run Indexer. THIS IS EXTREMELY TIME-CONSUMING AND DANGEROUS: IT WILL DELETE THE INDEX IF YOU ALREADY HAVE ONE! As a safeguard, you have to click the button five times in a row, across five separate refreshes of the page.
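The five-click safeguard can be pictured as a small counter that only lets the destructive rebuild through on the fifth consecutive press. The sketch below is an illustration only, not the code in launcher.py: the class name IndexerGuard and its methods are hypothetical, and it leaves out the Flask wiring entirely.

```python
class IndexerGuard:
    """Hypothetical helper: counts consecutive presses of the Run Indexer button."""

    def __init__(self, required_clicks=5):
        self.required_clicks = required_clicks
        self.clicks = 0

    def click(self):
        """Register one press; return True only on the final required press."""
        self.clicks += 1
        if self.clicks >= self.required_clicks:
            # Reset so the next rebuild needs a fresh run of presses.
            self.clicks = 0
            return True
        return False

    def reset(self):
        """Call when the user does anything else, breaking the run of presses."""
        self.clicks = 0


guard = IndexerGuard()
results = [guard.click() for _ in range(5)]
print(results)
```

In a web app, each page refresh would call click() once, and any other request would call reset(), so the index is only deleted after five presses in a row.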

You can also create the index by running python3 indexer.py

Once the indices are created, you can search through them.
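For readers unfamiliar with how such an index answers a query, here is a minimal sketch of an inverted index with log-scaled term-frequency weights. It is not the project's implementation: the function names tokenize, build_index, and search are assumptions, stemming is omitted, and the scoring is a simplified stand-in for whatever indexer.py and search.py actually do.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens (no stemming here)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(docs):
    """Map each term to {doc_id: weight}, where weight = 1 + log10(term count)."""
    index = defaultdict(dict)
    for doc_id, text in docs.items():
        for term, count in Counter(tokenize(text)).items():
            index[term][doc_id] = 1 + math.log10(count)
    return index


def search(index, query):
    """Sum the weights of the query terms per document and rank best first."""
    scores = defaultdict(float)
    for term in tokenize(query):
        for doc_id, weight in index.get(term, {}).items():
            scores[doc_id] += weight
    # Ties are broken by doc_id so the ordering is deterministic.
    return sorted(scores, key=lambda d: (-scores[d], d))


# Made-up documents, for illustration only.
docs = {
    "a": "uci informatics informatics",
    "b": "uci computer science",
}
index = build_index(docs)
print(search(index, "uci informatics"))
```

The lookup touches only the postings for the query terms, which is why building the index is slow but searching it afterwards is fast.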

Notably