X-Git-Url: https://git.phdru.name/?a=blobdiff_plain;ds=sidebyside;f=doc%2FTODO;h=e4fe65203b33b6bb9f1f6a9652b3e97312e13f82;hb=b65d59b866ec851da9871afe9707f6b1e38866be;hp=0ada41e7661c5965692fcd48ebe863208dcee8b5;hpb=dc704d6e4362a5f48420e5f991519330bf837ded;p=bookmarks_db.git

diff --git a/doc/TODO b/doc/TODO
index 0ada41e..e4fe652 100644
--- a/doc/TODO
+++ b/doc/TODO
@@ -1,17 +1,33 @@
-Get and store icon.
+HTML parser based on BeautifulSoup4. Bs3 for Python 2, bs4 for Py3.
 
-Cleanup HTML before parsing using BeautifulSoap or Tidy.
-Parse downloaded file and get javascript redirects.
+Replace subproc.py with some IPC. Or update for Python 3.
 
-More and better documentation.
+Python 3.
 
-Merge "writers" to storage managers.
-New storage managers: shelve, SQL, ZODB, MetaKit.
-More robots (URL checkers): threading, asyncore-based.
+Forbid external names to resolve to internal addresses (127.0.0.1, etc).
 
 Configuration file to configure defaults - global defaults for the system
 and local defaults for subsystems.
 
+Robot based on PycURL.
+
+Robot based on Scrapy.
+
+A program to publish bookmarks with icons.
+
+Fetch description from and store it in
+bookmark.description if the description is empty. (How to update old
+descriptions without replacing my own comments?)
+
+Parse (or interpret) downloaded file and get javascript redirects.
+
+More and better documentation.
+
+Merge "writers" to storage managers.
+New storage managers: shelve, SQL, ZODB, MetaKit.
+More robots (URL checkers): threading, asyncore-based;
+robots that test many URLs in parallel.
+
 Ruleset-based mechanisms to filter out what types of URLs to check: checking
 based on URL schema, host, port, path, filename, extension, etc.
 
@@ -23,3 +39,5 @@ Bigger database. Multiuser database. Robot should operates on a part of
 the DB.
 WWW-interface to the database. User should import/export/edit bookmarks,
 schedule robot run, etc.
+
+A program to collect and check links from a site.
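
The TODO item "Forbid external names to resolve to internal addresses (127.0.0.1, etc)" could be approached roughly as below. This is only a sketch, not code from bookmarks_db: the function names `is_internal_address` and `resolves_to_internal`, and the choice of which address classes count as "internal", are assumptions made for illustration. It uses only the standard `ipaddress` and `socket` modules.

```python
import ipaddress
import socket


def is_internal_address(address):
    """Return True if the IP address string is not publicly routable.

    Hypothetical helper; which classes count as "internal" is an assumption.
    """
    # Drop an IPv6 zone id such as "%eth0"; older Pythons cannot parse it.
    ip = ipaddress.ip_address(address.split("%", 1)[0])
    return (ip.is_private or ip.is_loopback or ip.is_link_local
            or ip.is_reserved or ip.is_multicast or ip.is_unspecified)


def resolves_to_internal(host, resolver=socket.getaddrinfo):
    """Return True if any address the host name resolves to is internal.

    The resolver is injectable so the check can be tested without DNS.
    """
    # Each getaddrinfo entry is (family, type, proto, canonname, sockaddr);
    # sockaddr[0] is the textual IP address for both IPv4 and IPv6.
    infos = resolver(host, None)
    return any(is_internal_address(info[4][0]) for info in infos)
```

A robot could call `resolves_to_internal(host)` before fetching a URL and skip the bookmark when it returns True. To stay safe against a name that resolves differently on a second lookup, the robot would need to connect to the address it has just vetted rather than resolve the name again.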