git.phdru.name Git - bookmarks_db.git/history - Robots/parse_html_beautifulsoup.py
2010-08-13  Oleg Broytman  Fixed encoding.
2010-08-13  Oleg Broytman  Fixed a bug - moved the code where meta_charset is...
2010-08-12  Oleg Broytman  Fixed a bug - don't do a double encode.
2010-08-12  Oleg Broytman  Try parser in order until the first one finds a title.
2010-08-11  Oleg Broytman  Moved HTMLParser from parse_html_beautifulsoup.py to...
2010-08-11  Oleg Broytman  2010.
2010-08-08  Oleg Broytman  Fixed a bug.
2010-08-08  Oleg Broytman  Fixed a bug - parse "HTTP-Equiv" without content.
2009-09-27  Oleg Broytman  "BroytMann" => "Broytman".
2008-03-09  Oleg Broytman  Title (and refresh) can be None.
2008-03-07  Oleg Broytman  Split the title into subparts, reassemble the subparts...
2008-03-07  Oleg Broytman  Lookup TITLE in HEAD, in HTML and in the root; test...
2008-03-04  Oleg Broytman  Full name for "IGNORECASE".
2008-03-04  Oleg Broytman  Ignore case for DOCTYPE.
2008-03-04  Oleg Broytman  Check root.
2008-03-04  Oleg Broytman  Reparse the HTML if the charset was changed.
2008-03-04  Oleg Broytman  I have never saw pages in MacCyriliic.
2008-03-04  Oleg Broytman  Replace ISO-8859-2 to the default encoding.
2008-03-04  Oleg Broytman  Do not log TypeError.
2008-03-03  Oleg Broytman  In the default hierarchy "root > html > head > title...
2008-03-03  Oleg Broytman  Log more parsers errors.
2008-03-03  Oleg Broytman  Fixed a bug in case there is no charset in META Content...
2008-03-03  Oleg Broytman  Test meta charset by looking in META HTTP-Equiv.
2008-02-13  Oleg Broytman  Replace BeautifulSoup's guessed cp1252 with DEFAULT_CHA...
2008-01-09  Oleg Broytman  Do the second check for title only if there is HEAD.
2008-01-08  Oleg Broytman  Some sites put TITLE in HTML outside of HEAD.
2008-01-08  Oleg Broytman  Some sites put TITLE in HTML without HEAD.
2008-01-08  Oleg Broytman  Do not return an empty string - pass it to BSoupParser.
2008-01-08  Oleg Broytman  If there is HEAD but no TITLE - return empty title.
2007-12-22  Oleg Broytman  Always close the input file.
2007-12-22  Oleg Broytman  Added BadDeclParser.
2007-12-18  Oleg Broytman  Try BeautifulSoup; if it fails - fall back to HTML...
2007-12-18  Oleg Broytman  Fixed a bug = meta_charset is True if HTTP charset...
2007-12-16  Oleg Broytman  Direct return.
2007-12-16  Oleg Broytman  Calculate if the charset came from HTTP or from HTML...
2007-12-16  Oleg Broytman  Inherit HTMLParser (for unescape).
2007-12-16  Oleg Broytman  Added parser for html based on BeautifulSoup.