Commit Graph

13 Commits

Author SHA1 Message Date
Hieuhuy Pham
5b0a9bfbe2 Git pushed after crawling #1 2022-04-25 20:19:40 -07:00
Hieuhuy Pham
c1b7a50460 Locks are not racing anymore, locks work, multi-thread works, changed some information-storing stuff so it's more readable, added some new regex but it will need to be trimmed later because it does not do its job 2022-04-23 18:49:24 -07:00
Hieuhuy Pham
74063e5d00 Fixed a lot of racing issues, there potentially could be a writer reader confusion type of thing, but it should not matter that much, as long as server is healthy we can let this bad boi loose 2022-04-23 02:13:12 -07:00
Hieuhuy Pham
90a5d16456 Load balancer installed, have not been able to test yet 2022-04-22 16:51:32 -07:00
Hieuhuy Pham
8b96a7c9f7 More refinement of frontier and worker for delicious multi-threading 2022-04-21 21:08:23 -07:00
iNocturnis
58c923f075 Merge branch 'data_collection' 2022-04-21 20:44:18 -07:00
Hieuhuy Pham
9301bd5ebe More locks and semaphore refinement 2022-04-21 20:41:25 -07:00
unknown
754d3b4af6 (andy) first move recent discussed issue 2022-04-21 20:31:38 -07:00
Hieuhuy Pham
320fe26c23 Added basic multi-threading, reader-first implementation 2022-04-21 19:44:30 -07:00
Hieuhuy Pham
58d15918d5 Change more syntax to get data collection working, check extracturl and sorted links into sets instead of lists to significantly reduce url extractions 2022-04-20 04:03:58 -07:00
Hieuhuy Pham
d0dde4a4db Fixes error in syntax for new merged code from data collection branch, fixed 'infinite loop', added timers to measure performance of functions. 2022-04-20 03:52:14 -07:00
unknown
bdd61a373b Moved stuff out of scraper 2022-04-20 00:49:49 -07:00
iNocturnis
e19f68a6a6 Add files via upload (First Upload) 2022-04-15 17:55:11 -07:00