Bookmarks Database and Internet Robot

WHAT IS IT

A set of classes, libraries, programs and plugins I use to manipulate my
bookmarks.html. I like Netscape Navigator, but I need more features, so I
write and maintain these programs for my own needs. In particular, I need to
extend Navigator's "What's new" feature (Navigator 4 calls it "Update
bookmarks").

WHAT'S NEW in version 3.4.0 (2004-07-27)

Updated to m_lib version 1.2. Extended support for Mozilla; keywords in
bookmarks.

WHAT'S NEW in version 3.3.2

parse_html.py can now recode unicode entities in titles.

WHAT'S NEW in version 3.3.0

Requires Python 2.2.

HTML parser. If the protocol is HTTP, there is a Content-Type header, and the
content type is text/html, the object is parsed to extract its title; if the
Content-Type header has a charset, or if the HTML has a <META> tag with a
charset, the title is converted from the given charset to the default charset.
The object is also parsed to extract a <META> tag with a redirect, if any.
(A minimal illustrative sketch of this title/charset handling appears at the
end of this file.)

WHAT'S NEW in version 3.0

Complete rewrite from scratch. Created a mechanism for pluggable storage
managers, writers (DB dumpers/exporters) and robots.

WHERE TO GET

Master site: http://phd.pp.ru/Software/Python/#bookmarks_db

Faster mirrors:
   http://phd.by.ru/Software/Python/#bookmarks_db
   http://phd2.chat.ru/Software/Python/#bookmarks_db

AUTHOR

Oleg Broytmann

COPYRIGHT

Copyright (C) 1997-2002 PhiloSoft Design

LICENSE

GPL

STATUS

Storage managers: pickle, FLAD (Flat ASCII Database).
Writers: HTML, text, FLAD (full database or only errors).
Robots (URL checkers): simple, simple + timeoutsocket, forking.

TODO

Parse downloaded files and extract additional information from headers and
parsed data - the title, for example, or redirects given in <META> tags.
(Partially done - the title is now extracted.)

Documentation.

Merge "writers" into storage managers.

New storage managers: shelve, SQL, ZODB, MetaKit.

New robots (URL checkers): threading, asyncore-based.

Aliases in bookmarks.html.

A configuration file for configuring defaults - global defaults for the
system and local defaults for subsystems.

Ruleset-based mechanisms to filter out what types of URLs to check: checking
based on URL schema, host, port, path, filename, extension, etc.

Detailed reports on a robot run - what's old, what's new, what was moved,
errors, etc. WWW interface to the report.

Bigger database. Multiuser database. Robot should operate on a part of the
DB. WWW interface to the database. Users will import/export/edit bookmarks,
schedule robot runs, etc.
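
EXAMPLE

For illustration only, here is a minimal sketch of the title/charset handling
described in the version 3.3.0 notes above. It is not the package's
parse_html.py or robot code: it uses the modern Python 3 standard library
(urllib.request, html.parser) rather than the Python 2.2 modules this package
targets, and the names TitleParser and fetch_title are invented for the
example.

    import re
    import urllib.request
    from html.parser import HTMLParser

    class TitleParser(HTMLParser):
        """Collect the first <title> text and any <META> charset declaration."""
        def __init__(self):
            super().__init__()  # convert_charrefs=True also decodes entities in the title
            self.title = ""
            self.charset = None
            self._in_title = False

        def handle_starttag(self, tag, attrs):
            attrs = dict(attrs)
            if tag == "title":
                self._in_title = True
            elif tag == "meta":
                # <meta charset="..."> or
                # <meta http-equiv="Content-Type" content="text/html; charset=...">
                if attrs.get("charset"):
                    self.charset = attrs["charset"]
                elif (attrs.get("http-equiv") or "").lower() == "content-type":
                    m = re.search(r"charset=([^;\s]+)", attrs.get("content") or "", re.I)
                    if m:
                        self.charset = m.group(1)

        def handle_endtag(self, tag):
            if tag == "title":
                self._in_title = False

        def handle_data(self, data):
            if self._in_title:
                self.title += data

    def fetch_title(url, default_charset="utf-8"):
        """Download url; if it is text/html, return its <title> as a string."""
        with urllib.request.urlopen(url) as response:
            if response.headers.get_content_type() != "text/html":
                return None
            header_charset = response.headers.get_content_charset()
            raw = response.read()
        # First pass: decode with the charset from the Content-Type header
        # (or a default) to find the <title> and any <META> charset.
        parser = TitleParser()
        parser.feed(raw.decode(header_charset or default_charset, "replace"))
        # The header charset wins; otherwise fall back to the <META> charset.
        charset = header_charset or parser.charset or default_charset
        if charset.lower() != (header_charset or default_charset).lower():
            parser = TitleParser()
            parser.feed(raw.decode(charset, "replace"))
        return parser.title.strip()

    if __name__ == "__main__":
        print(fetch_title("http://phd.pp.ru/Software/Python/"))

The real robots additionally follow the extracted <META> redirect and record
errors per bookmark; this sketch only shows the header-versus-<META> charset
decision for the title.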