X-Git-Url: https://git.phdru.name/?a=blobdiff_plain;f=doc%2FTODO;h=e4fe65203b33b6bb9f1f6a9652b3e97312e13f82;hb=b65d59b866ec851da9871afe9707f6b1e38866be;hp=a74684b467cee497e87c307fb42f3bbd859812b5;hpb=037c50a4a20df82a51375d9fcb075d4ac5add0b8;p=bookmarks_db.git

diff --git a/doc/TODO b/doc/TODO
index a74684b..e4fe652 100644
--- a/doc/TODO
+++ b/doc/TODO
@@ -1,23 +1,43 @@
- Clean up HTML before parsing, using BeautifulSoup or Tidy.
- Parse downloaded file and get javascript redirects.
+HTML parser based on BeautifulSoup 4; bs3 for Python 2, bs4 for Python 3.
 
- More and better documentation.
+Replace subproc.py with some other IPC mechanism, or update it for Python 3.
 
- Merge "writers" into storage managers.
- New storage managers: shelve, SQL, ZODB, MetaKit.
- More robots (URL checkers): threading, asyncore-based.
+Port to Python 3.
 
- Configuration file to configure defaults - global defaults for the system
- and local defaults for subsystems.
+Forbid external names from resolving to internal addresses (127.0.0.1, etc.).
 
- Ruleset-based mechanisms to filter which types of URLs to check: checking
- based on URL scheme, host, port, path, filename, extension, etc.
+Configuration file to configure defaults - global defaults for the system
+and local defaults for subsystems.
 
- Detailed reports on robot runs - what's old, what's new, what has been moved,
- errors, etc.
- WWW-interface to the report.
+Robot based on PycURL.
 
- Bigger database. Multiuser database. Robot should operate on a part of
- the DB.
- WWW-interface to the database. User should import/export/edit bookmarks,
- schedule robot runs, etc.
+Robot based on Scrapy.
+
+A program to publish bookmarks with icons.
+
+Fetch description from <META NAME="description" CONTENT="..."> and store it in
+bookmark.description if the description is empty. (How to update old
+descriptions without replacing my own comments?)
+
+Parse (or interpret) the downloaded file and get javascript redirects.
+
+More and better documentation.
+
+Merge "writers" into storage managers.
+New storage managers: shelve, SQL, ZODB, MetaKit.
+More robots (URL checkers): threading- or asyncore-based;
+robots that test many URLs in parallel.
+
+Ruleset-based mechanisms to filter which types of URLs to check: checking
+based on URL scheme, host, port, path, filename, extension, etc.
+
+Detailed reports on robot runs - what's old, what's new, what has been moved,
+errors, etc.
+WWW-interface to the report.
+
+Bigger database. Multiuser database. Robot should operate on a part of
+the DB.
+WWW-interface to the database. User should import/export/edit bookmarks,
+schedule robot runs, etc.
+
+A program to collect and check links from a site.
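A note on the "Forbid external names from resolving to internal addresses" item: one way to enforce it is to resolve the host and refuse any result that falls into a loopback, private, link-local or otherwise reserved range. The following is a minimal sketch using only the Python 3 standard library; the function name and return convention are made up for illustration and are not taken from the project:

    import ipaddress
    import socket

    def resolves_to_internal(hostname):
        """Return True if any address the name resolves to is internal."""
        try:
            infos = socket.getaddrinfo(hostname, None)
        except socket.gaierror:
            return False  # unresolvable hosts are reported elsewhere
        for _family, _type, _proto, _canonname, sockaddr in infos:
            # Strip a possible IPv6 zone id such as "%eth0" before parsing.
            addr = ipaddress.ip_address(sockaddr[0].split('%')[0])
            if (addr.is_loopback or addr.is_private
                    or addr.is_link_local or addr.is_reserved):
                return True
        return False

Every returned address has to be checked, because a single name can resolve to both public and internal addresses.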
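For the "Robot based on PycURL" item, the core of such a robot is a fetch routine like the sketch below. It is only an outline; the option values (timeouts, redirect limit) are illustrative and not taken from the project's configuration:

    from io import BytesIO

    import pycurl

    def fetch(url, timeout=30):
        """Fetch url with PycURL; return (HTTP status, body bytes)."""
        body = BytesIO()
        curl = pycurl.Curl()
        curl.setopt(pycurl.URL, url)
        curl.setopt(pycurl.FOLLOWLOCATION, 1)
        curl.setopt(pycurl.MAXREDIRS, 10)
        curl.setopt(pycurl.CONNECTTIMEOUT, timeout)
        curl.setopt(pycurl.TIMEOUT, timeout)
        curl.setopt(pycurl.WRITEDATA, body)
        try:
            curl.perform()
            return curl.getinfo(pycurl.RESPONSE_CODE), body.getvalue()
        finally:
            curl.close()

A real robot would also collect headers, handle pycurl.error, and feed the body to the HTML parser and redirect detection mentioned in the other items.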
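For the item about fetching the description from <META NAME="description" CONTENT="...">, a BeautifulSoup 4 based extraction could look like this sketch; extract_description is a hypothetical helper, and bookmark.description in the usage comment refers to the attribute named in the TODO item:

    import re

    from bs4 import BeautifulSoup

    def extract_description(html):
        """Return the content of the meta description tag, or None."""
        soup = BeautifulSoup(html, "html.parser")
        # Match the name attribute case-insensitively (NAME="Description" etc.).
        tag = soup.find("meta", attrs={"name": re.compile(r"^description$", re.I)})
        if tag and tag.get("content"):
            return tag["content"].strip()
        return None

    # Hypothetical usage that fills in only empty descriptions,
    # leaving hand-written comments untouched:
    # if not bookmark.description:
    #     bookmark.description = extract_description(downloaded_html) or ''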
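The "Parse (or interpret) the downloaded file and get javascript redirects" item could start with a simple textual heuristic: scan the page for assignments to location. This sketch only catches literal string assignments; redirects built up dynamically would need a real JavaScript interpreter:

    import re

    # Matches e.g.: window.location = "..."; location.href = '...';
    _JS_REDIRECT_RE = re.compile(
        r"""(?:window\.|document\.|top\.)?location(?:\.href)?\s*=\s*(['"])(.+?)\1""")

    def find_js_redirects(html_text):
        """Return candidate redirect URLs assigned to location in the page."""
        return [m.group(2) for m in _JS_REDIRECT_RE.finditer(html_text)]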