git.phdru.name Git - bookmarks_db.git/history - Robots/parse_html.py
2008-03-03  Oleg Broytman  Always log guessed charset even if it's utf-8.
2008-03-03  Oleg Broytman  Charset was guessed if it is not from META and not...
2008-03-03  Oleg Broytman  Create the list of charsets outside of the parsers...
2008-02-25  Oleg Broytman  &nbsp; is an entity that needs to be encoded.
2008-02-24  Oleg Broytman  Used name2codepoint directly; recode it.
2008-02-24  Oleg Broytman  Combined two "if"s.
2008-02-24  Oleg Broytman  Do not unquote standard HTML entities.
2008-02-24  Oleg Broytman  Emulate log.
2008-02-23  Oleg Broytman  Fixed a bug - break out of the loop after finding the...
2008-02-23  Oleg Broytman  It is not HTTP charset, it is guessed charset.
2008-02-23  Oleg Broytman  Try a list of charsets, including the universal (utf...
2008-02-13  Oleg Broytman  Stop meddling with cp1252.
2008-02-12  Oleg Broytman  current_charset is only needed in main.
2008-02-11  Oleg Broytman  Recode entities before num. entities.
2008-02-11  Oleg Broytman  Switched to utf-8.
2008-02-11  Oleg Broytman  Recode HTML entities.
2007-12-28  Oleg Broytman  Do not display too much titles if they are equal.
2007-12-27  Oleg Broytman  Strip every line in title.
2007-12-22  Oleg Broytman  Do not encode non-encodeable entities.
2007-12-18  Oleg Broytman  Fixed a bug.
2007-12-18  Oleg Broytman  Do all manipulations with title in one place.
2007-12-18  Oleg Broytman  Log the module's name of the failed parse_html.
2007-12-18  Oleg Broytman  Try BeautifulSoup; if it fails - fall back to HTML...
2007-12-18  Oleg Broytman  Recode from DEFAULT_CHARSET if recoding from cp1252...
2007-12-16  Oleg Broytman  Added parser for html based on BeautifulSoup.
2007-12-16  Oleg Broytman  Split parse_html.py into parse_html_htmlparser.py.
2007-10-11  Oleg Broytman  Fixed a bug: import sys.
2007-10-11  Oleg Broytman  Ignore case for comparison.
2007-10-10  Oleg Broytman  Fixed a bug: import codecs.
2007-10-10  Oleg Broytman  Use m_lib.defenc.
2007-10-10  Oleg Broytman  In case of unknown charset try charset from HTML.
2007-10-10  Oleg Broytman  Initialize parser.icon in case there are no <link>...
2007-09-25  Oleg Broytman  Find an icon's URL in the HTML.
2007-09-07  Oleg Broytman  /usr/bin/env python
2005-01-29  Oleg Broytman  If sys.getdefaultencoding() returns "ascii" - use
2003-07-28  Oleg Broytman  Updated to m_lib version 1.2. Extended support for...
2003-07-24  Oleg Broytman  Parse and recode unicode entities.
2003-07-24  Oleg Broytman  Version 3.3.1.