X-Git-Url: https://git.phdru.name/?a=blobdiff_plain;f=doc%2FTODO;h=e4fe65203b33b6bb9f1f6a9652b3e97312e13f82;hb=b65d59b866ec851da9871afe9707f6b1e38866be;hp=a74684b467cee497e87c307fb42f3bbd859812b5;hpb=037c50a4a20df82a51375d9fcb075d4ac5add0b8;p=bookmarks_db.git

diff --git a/doc/TODO b/doc/TODO
index a74684b..e4fe652 100644
--- a/doc/TODO
+++ b/doc/TODO
@@ -1,23 +1,43 @@
- Clean up HTML before parsing, using BeautifulSoup or Tidy.
- Parse downloaded file and get javascript redirects.
+HTML parser based on BeautifulSoup 4; bs3 for Python 2, bs4 for Python 3.
 
- More and better documentation.
+Replace subproc.py with some other IPC mechanism, or update it for Python 3.
 
- Merge "writers" into storage managers.
- New storage managers: shelve, SQL, ZODB, MetaKit.
- More robots (URL checkers): threading, asyncore-based.
+Port to Python 3.
 
- Configuration file to configure defaults - global defaults for the system
- and local defaults for subsystems.
+Forbid external names from resolving to internal addresses (127.0.0.1, etc.).
 
- Ruleset-based mechanisms to filter which types of URLs to check: checking
- based on URL scheme, host, port, path, filename, extension, etc.
+Configuration file to configure defaults - global defaults for the system
+and local defaults for subsystems.
 
- Detailed reports on robot runs - what's old, what's new, what has been moved,
- errors, etc.
- WWW-interface to the report.
+Robot based on PycURL.
 
- Bigger database. Multiuser database. Robot should operate on a part of
- the DB.
- WWW-interface to the database. User should import/export/edit bookmarks,
- schedule robot runs, etc.
+Robot based on Scrapy.
+
+A program to publish bookmarks with icons.
+
+Fetch description from <META NAME="description" CONTENT="..."> and store it in
+bookmark.description if the description is empty. (How to update old
+descriptions without replacing my own comments?)
+
+Parse (or interpret) the downloaded file and get javascript redirects.
+
+More and better documentation.
+
+Merge "writers" into storage managers.
+New storage managers: shelve, SQL, ZODB, MetaKit.
+More robots (URL checkers): threading- or asyncore-based;
+robots that test many URLs in parallel.
+
+Ruleset-based mechanisms to filter which types of URLs to check: checking
+based on URL scheme, host, port, path, filename, extension, etc.
+
+Detailed reports on robot runs - what's old, what's new, what has been moved,
+errors, etc.
+WWW-interface to the report.
+
+Bigger database. Multiuser database. Robot should operate on a part of
+the DB.
+WWW-interface to the database. User should import/export/edit bookmarks,
+schedule robot runs, etc.
+
+A program to collect and check links from a site.
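A note on the "Forbid external names from resolving to internal addresses" item: one way to enforce it is to resolve the host and refuse any result that falls into a loopback, private, link-local or otherwise reserved range. The following is a minimal sketch using only the Python 3 standard library; the function name and return convention are made up for illustration and are not taken from the project:

    import ipaddress
    import socket

    def resolves_to_internal(hostname):
        """Return True if any address the name resolves to is internal."""
        try:
            infos = socket.getaddrinfo(hostname, None)
        except socket.gaierror:
            return False  # unresolvable hosts are reported elsewhere
        for _family, _type, _proto, _canonname, sockaddr in infos:
            # Strip a possible IPv6 zone id such as "%eth0" before parsing.
            addr = ipaddress.ip_address(sockaddr[0].split('%')[0])
            if (addr.is_loopback or addr.is_private
                    or addr.is_link_local or addr.is_reserved):
                return True
        return False

Every returned address has to be checked, because a single name can resolve to both public and internal addresses.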
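For the "Robot based on PycURL" item, the core of such a robot is a fetch routine like the sketch below. It is only an outline; the option values (timeouts, redirect limit) are illustrative and not taken from the project's configuration:

    from io import BytesIO

    import pycurl

    def fetch(url, timeout=30):
        """Fetch url with PycURL; return (HTTP status, body bytes)."""
        body = BytesIO()
        curl = pycurl.Curl()
        curl.setopt(pycurl.URL, url)
        curl.setopt(pycurl.FOLLOWLOCATION, 1)
        curl.setopt(pycurl.MAXREDIRS, 10)
        curl.setopt(pycurl.CONNECTTIMEOUT, timeout)
        curl.setopt(pycurl.TIMEOUT, timeout)
        curl.setopt(pycurl.WRITEDATA, body)
        try:
            curl.perform()
            return curl.getinfo(pycurl.RESPONSE_CODE), body.getvalue()
        finally:
            curl.close()

A real robot would also collect headers, handle pycurl.error, and feed the body to the HTML parser and redirect detection mentioned in the other items.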
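For the item about fetching the description from <META NAME="description" CONTENT="...">, a BeautifulSoup 4 based extraction could look like this sketch; extract_description is a hypothetical helper, and bookmark.description in the usage comment refers to the attribute named in the TODO item:

    import re

    from bs4 import BeautifulSoup

    def extract_description(html):
        """Return the content of the meta description tag, or None."""
        soup = BeautifulSoup(html, "html.parser")
        # Match the name attribute case-insensitively (NAME="Description" etc.).
        tag = soup.find("meta", attrs={"name": re.compile(r"^description$", re.I)})
        if tag and tag.get("content"):
            return tag["content"].strip()
        return None

    # Hypothetical usage that fills in only empty descriptions,
    # leaving hand-written comments untouched:
    # if not bookmark.description:
    #     bookmark.description = extract_description(downloaded_html) or ''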
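The "Parse (or interpret) the downloaded file and get javascript redirects" item could start with a simple textual heuristic: scan the page for assignments to location. This sketch only catches literal string assignments; redirects built up dynamically would need a real JavaScript interpreter:

    import re

    # Matches e.g.: window.location = "..."; location.href = '...';
    _JS_REDIRECT_RE = re.compile(
        r"""(?:window\.|document\.|top\.)?location(?:\.href)?\s*=\s*(['"])(.+?)\1""")

    def find_js_redirects(html_text):
        """Return candidate redirect URLs assigned to location in the page."""
        return [m.group(2) for m in _JS_REDIRECT_RE.finditer(html_text)]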