- Parse downloaded file and get some additional information out of headers
- and parsed data - title, for example. Or redirects using <META HTTP-Equiv>.
- (Partially done - now extracting title).
+ Clean up HTML before parsing using BeautifulSoup or Tidy.
+ Parse the downloaded file and extract JavaScript redirects.
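The title extraction noted above as partially done, plus <META HTTP-Equiv> redirect detection, could look like the following sketch. It uses the stdlib html.parser as a stand-in; BeautifulSoup or Tidy would replace the hand-rolled parsing, and the `parse_page` interface is an illustrative assumption, not the project's actual API.

```python
from html.parser import HTMLParser

class TitleRedirectParser(HTMLParser):
    """Collect <title> text and a <meta http-equiv="refresh"> redirect."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.redirect = None
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "meta" and attrs.get("http-equiv", "").lower() == "refresh":
            # content typically looks like "0; url=http://example.com/"
            content = attrs.get("content", "")
            if "url=" in content.lower():
                self.redirect = content.split("=", 1)[1].strip()

    def handle_data(self, data):
        if self._in_title:
            self.title += data

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

def parse_page(html):
    """Return (title, redirect-or-None) for a downloaded HTML page."""
    parser = TitleRedirectParser()
    parser.feed(html)
    return parser.title.strip(), parser.redirect
```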
- Documentation.
+ More and better documentation.
Merge "writers" into storage managers.
New storage managers: shelve, SQL, ZODB, MetaKit.
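Of the storage managers listed, the shelve-backed one is the simplest to sketch. The store/load interface below is an assumption for illustration; the project's real storage-manager protocol may differ.

```python
import shelve

class ShelveStorage:
    """Minimal storage manager backed by the stdlib shelve module.

    Bookmarks are modeled as a mapping of URL -> info dict; an SQL,
    ZODB, or MetaKit manager would expose the same store/load pair.
    """

    def __init__(self, path):
        self.path = path

    def store(self, bookmarks):
        with shelve.open(self.path) as db:
            for url, info in bookmarks.items():
                db[url] = info

    def load(self):
        with shelve.open(self.path) as db:
            return dict(db)
```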
- Robots (URL checkers): threading, asyncore-based.
- Aliases in bookmarks.html.
+ More robots (URL checkers): threading, asyncore-based.
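A threaded robot could follow the worker-pool shape below; `check` stands in for whatever per-URL test the robot performs (a HEAD request, for example), so the sketch itself needs no network. An asyncore-based variant would replace the threads with an event loop (note that asyncore is deprecated in current Python).

```python
import queue
import threading

def check_urls(urls, check, num_workers=4):
    """Run check(url) over urls with a pool of worker threads.

    Returns a dict of url -> check result.
    """
    tasks = queue.Queue()
    for url in urls:
        tasks.put(url)

    results = {}
    lock = threading.Lock()

    def worker():
        while True:
            try:
                url = tasks.get_nowait()
            except queue.Empty:
                return  # queue drained; worker exits
            result = check(url)
            with lock:  # results dict is shared between workers
                results[url] = result

    threads = [threading.Thread(target=worker) for _ in range(num_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```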
- Configuration file for configuring defaults - global defaults for the system
+ Configuration file to set defaults - global defaults for the system
and local defaults for subsystems.
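The stdlib configparser already supports this global/local layering: the [DEFAULT] section supplies system-wide values and each subsystem section overrides only what it needs. The section and option names below are hypothetical.

```python
import configparser

# Hypothetical layout: [DEFAULT] holds global defaults,
# each subsystem section (here [robot]) overrides locally.
SAMPLE = """\
[DEFAULT]
timeout = 30
log_level = info

[robot]
timeout = 10
"""

config = configparser.ConfigParser()
config.read_string(SAMPLE)

timeout = config.getint("robot", "timeout")    # local override
log_level = config.get("robot", "log_level")   # inherited global default
```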
Ruleset-based mechanisms to filter out what types of URLs to check: checking
based on URL scheme, host, port, path, filename, extension, etc.
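A ruleset over those URL fields might be a list of (field, allowed-values) pairs compiled into a predicate, as sketched below with the stdlib urllib.parse. The rule vocabulary is an illustrative assumption, not the project's actual ruleset format.

```python
from urllib.parse import urlsplit

def make_filter(rules):
    """Compile a list of (field, allowed-values) rules into a predicate."""
    def accept(url):
        parts = urlsplit(url)
        fields = {
            "scheme": parts.scheme,
            "host": parts.hostname or "",
            "port": parts.port,
            # filename extension, or "" when the path has none
            "extension": parts.path.rsplit(".", 1)[-1] if "." in parts.path else "",
        }
        return all(fields[field] in allowed for field, allowed in rules)
    return accept

# Example ruleset: only HTTP(S), only HTML pages or extensionless paths.
accept = make_filter([("scheme", {"http", "https"}),
                      ("extension", {"", "html", "htm"})])
```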
- Detailed reports on robot run - what's old, what's new, what was moved,
+ Detailed reports on robot run - what's old, what's new, what has been moved,
errors, etc.
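The old/new/moved report reduces to a diff of two runs. The sketch below assumes, for illustration, that each run records a mapping of bookmark URL to the final URL reached after redirects; the real robot output is richer.

```python
def diff_runs(previous, current):
    """Compare two robot runs (url -> final url after redirects).

    Returns sorted lists of new, gone, and moved bookmark URLs.
    """
    old_urls = set(previous)
    new_urls = set(current)
    # moved: present in both runs, but resolving to a different location
    moved = {url for url in old_urls & new_urls
             if previous[url] != current[url]}
    return {
        "new": sorted(new_urls - old_urls),
        "gone": sorted(old_urls - new_urls),
        "moved": sorted(moved),
    }
```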
WWW-interface to the report.
- Bigger database. Multiuser database. Robot should operate on a part of
+ Bigger database. Multiuser database. Robot should operate on a part of
the DB.
- WWW-interface to the database. User will import/export/edit bookmarks,
+ WWW-interface to the database. Users should be able to import/export/edit
  bookmarks, schedule robot runs, etc.