git.phdru.name Git - bookmarks_db.git/log
bookmarks_db.git
10 months ago Feat: Dropped support for Python 2
Oleg Broytman [Thu, 8 Aug 2024 04:45:58 +0000 (07:45 +0300)]
Feat: Dropped support for Python 2

10 months ago Feat(Robots): Remove urllib-based robots
Oleg Broytman [Thu, 8 Aug 2024 04:31:23 +0000 (07:31 +0300)]
Feat(Robots): Remove urllib-based robots

10 months ago Style(Writers/bkmk_wflad): Rename loop variables
Oleg Broytman [Wed, 7 Aug 2024 22:17:09 +0000 (01:17 +0300)]
Style(Writers/bkmk_wflad): Rename loop variables

10 months ago Refactor: Extract the common list of attributes; `copy_bkmk()`
Oleg Broytman [Wed, 7 Aug 2024 22:14:28 +0000 (01:14 +0300)]
Refactor: Extract the common list of attributes; `copy_bkmk()`

10 months ago Feat(check_urls): Separately report redirects
Oleg Broytman [Wed, 7 Aug 2024 21:51:21 +0000 (00:51 +0300)]
Feat(check_urls): Separately report redirects

10 months ago Feat(Robots): Stop the robot ASAP
Oleg Broytman [Wed, 7 Aug 2024 17:00:53 +0000 (20:00 +0300)]
Feat(Robots): Stop the robot ASAP

Process bookmarks after the robot has stopped. In the future
there will be robots that check multiple URLs in parallel,
so bookmarks cannot be processed inside the check loop
but can be queried afterwards.

10 months ago Style(Robots): Rename `check_url` to `check_bookmark`
Oleg Broytman [Wed, 7 Aug 2024 15:28:49 +0000 (18:28 +0300)]
Style(Robots): Rename `check_url` to `check_bookmark`

Also rename `smart_get` to `get_url`.

10 months ago Fix(bkmk-add): Stop the robot
Oleg Broytman [Tue, 6 Aug 2024 17:52:43 +0000 (20:52 +0300)]
Fix(bkmk-add): Stop the robot

10 months ago Style(bkmk-add): Rename `_robot` -> `robot`
Oleg Broytman [Tue, 6 Aug 2024 16:18:36 +0000 (19:18 +0300)]
Style(bkmk-add): Rename `_robot` -> `robot`

10 months ago Feat(Robots): Do not return error from `check_url()`
Oleg Broytman [Tue, 6 Aug 2024 16:01:37 +0000 (19:01 +0300)]
Feat(Robots): Do not return error from `check_url()`

Break the entire program with `Ctrl-C`.

10 months ago Feat(bkmk_rrequests): Install socks dependency
Oleg Broytman [Tue, 6 Aug 2024 13:43:58 +0000 (16:43 +0300)]
Feat(bkmk_rrequests): Install socks dependency

10 months ago Feat(bkmk_raiohttp): Use aioftp
Oleg Broytman [Tue, 6 Aug 2024 11:27:46 +0000 (14:27 +0300)]
Feat(bkmk_raiohttp): Use aioftp

10 months ago Fix(Robots): Do not route ftp requests via http(s) proxy
Oleg Broytman [Tue, 6 Aug 2024 10:07:50 +0000 (13:07 +0300)]
Fix(Robots): Do not route ftp requests via http(s) proxy

SOCKS5 proxies are OK.

10 months ago Feat(Robots): Robot based on aiohttp (tag: 5.5.0)
Oleg Broytman [Mon, 5 Aug 2024 12:00:55 +0000 (15:00 +0300)]
Feat(Robots): Robot based on aiohttp

10 months ago Feat(Writers/bkmk_whtml): Mark special folders
Oleg Broytman [Tue, 6 Aug 2024 06:16:46 +0000 (09:16 +0300)]
Feat(Writers/bkmk_whtml): Mark special folders

10 months ago Feat: Delete `root_folder.linear` before storing
Oleg Broytman [Mon, 5 Aug 2024 20:19:03 +0000 (23:19 +0300)]
Feat: Delete `root_folder.linear` before storing

10 months ago Fix(bkmk_robot_base): Convert environment parameters to integer
Oleg Broytman [Mon, 5 Aug 2024 15:19:26 +0000 (18:19 +0300)]
Fix(bkmk_robot_base): Convert environment parameters to integer

10 months ago Feat(bkmk_robot_base): Cut long `data:` icon URLs for logs
Oleg Broytman [Mon, 5 Aug 2024 14:33:33 +0000 (17:33 +0300)]
Feat(bkmk_robot_base): Cut long `data:` icon URLs for logs

10 months ago Fix(bkmk_rcurl): IDNA-encode URLs (tag: 5.4.1)
Oleg Broytman [Mon, 5 Aug 2024 12:56:23 +0000 (15:56 +0300)]
Fix(bkmk_rcurl): IDNA-encode URLs

PycURL doesn't encode URLs itself
and requires URLs to be in ASCII encoding.
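
The host part of a URL can be IDNA-encoded with the standard library alone. The following is a minimal sketch, not the code from `bkmk_rcurl`: `idna_encode_url` is a hypothetical name, and only the host is converted, so a non-ASCII path or query would still need percent-encoding.

```python
from urllib.parse import urlsplit, urlunsplit


def idna_encode_url(url):
    """Return url with its host converted to ASCII (IDNA/punycode) form.

    Hypothetical helper for illustration only.
    """
    parts = urlsplit(url)
    host = parts.hostname
    if host is None:
        return url  # nothing to encode (e.g. a relative URL)
    if ":" in host:
        # IPv6 literal: already ASCII, restore the brackets urlsplit removed
        netloc = "[%s]" % host
    else:
        netloc = host.encode("idna").decode("ascii")
    if parts.port is not None:
        netloc = "%s:%d" % (netloc, parts.port)
    if parts.username is not None:
        userinfo = parts.username
        if parts.password is not None:
            userinfo += ":" + parts.password
        netloc = userinfo + "@" + netloc
    return urlunsplit(
        (parts.scheme, netloc, parts.path, parts.query, parts.fragment))
```

A URL whose host is already ASCII passes through unchanged.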

10 months ago Refactor(Robots): Connect timeout
Oleg Broytman [Mon, 5 Aug 2024 10:41:08 +0000 (13:41 +0300)]
Refactor(Robots): Connect timeout

10 months ago Version 5.4.0: Robot based on PycURL (tag: 5.4.0)
Oleg Broytman [Fri, 2 Aug 2024 10:42:11 +0000 (13:42 +0300)]
Version 5.4.0: Robot based on PycURL

10 months ago Build: Add `devscripts`
Oleg Broytman [Fri, 2 Aug 2024 10:44:15 +0000 (13:44 +0300)]
Build: Add `devscripts`

10 months ago Fix(bkmk_robot_base): Fix reporting proxy error
Oleg Broytman [Thu, 1 Aug 2024 16:10:30 +0000 (19:10 +0300)]
Fix(bkmk_robot_base): Fix reporting proxy error

10 months ago Feat(Robots): Update X-User-Agent header
Oleg Broytman [Thu, 1 Aug 2024 10:03:16 +0000 (13:03 +0300)]
Feat(Robots): Update X-User-Agent header

10 months ago Feat(Robots): Upgrade headers
Oleg Broytman [Thu, 1 Aug 2024 10:03:03 +0000 (13:03 +0300)]
Feat(Robots): Upgrade headers

10 months ago Docs(bkmk_robot_base): List robots
Oleg Broytman [Thu, 1 Aug 2024 08:40:32 +0000 (11:40 +0300)]
Docs(bkmk_robot_base): List robots

10 months ago Fix(bkmk_rrequests): Check `r is not None`
Oleg Broytman [Thu, 1 Aug 2024 07:26:13 +0000 (10:26 +0300)]
Fix(bkmk_rrequests): Check `r is not None`

It seems `if r` is not enough --
`bool(r)` returns `False` in case there was an HTTP error.
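
In requests, a `Response` object is truthy only when its `ok` property is true, so a 4xx or 5xx response evaluates as `False` even though a response was received. The sketch below illustrates the pitfall with a stand-in class so that it runs without the library; `FakeResponse` and `describe` are hypothetical names, not project code.

```python
class FakeResponse:
    """Hypothetical stand-in: truthy only when the status is not 4xx/5xx,
    mimicking how requests.Response behaves in a boolean context."""

    def __init__(self, status_code):
        self.status_code = status_code

    @property
    def ok(self):
        return not (400 <= self.status_code < 600)

    def __bool__(self):
        return self.ok


def describe(r):
    """Distinguish 'no response at all' from 'response with an HTTP error'."""
    if r is None:
        return "no response"
    if not r:
        # A response exists, but its status code signals an error
        return "HTTP error %d" % r.status_code
    return "ok"
```

With a plain `if r:` test, the 404 case and the `None` case would be indistinguishable.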

10 months ago Style(setup.py): Remove unused import
Oleg Broytman [Thu, 1 Aug 2024 05:06:27 +0000 (08:06 +0300)]
Style(setup.py): Remove unused import

10 months ago Feat(get_url): Parse args, save/print headers/body
Oleg Broytman [Thu, 1 Aug 2024 04:34:11 +0000 (07:34 +0300)]
Feat(get_url): Parse args, save/print headers/body

10 months ago Feat(robots): Report robot being used
Oleg Broytman [Thu, 1 Aug 2024 04:19:34 +0000 (07:19 +0300)]
Feat(robots): Report robot being used

10 months ago Feat(bkmk_robot_base): Report error on getting icon
Oleg Broytman [Wed, 31 Jul 2024 22:47:45 +0000 (01:47 +0300)]
Feat(bkmk_robot_base): Report error on getting icon

10 months ago Feat(get_url): Print headers
Oleg Broytman [Wed, 31 Jul 2024 22:25:07 +0000 (01:25 +0300)]
Feat(get_url): Print headers

10 months ago Update docs
Oleg Broytman [Wed, 31 Jul 2024 18:21:48 +0000 (21:21 +0300)]
Update docs

10 months ago Feat(robots): Try robots from a list
Oleg Broytman [Wed, 31 Jul 2024 18:12:59 +0000 (21:12 +0300)]
Feat(robots): Try robots from a list

Default list is curl,requests,forking.

10 months ago Feat(Robots): Robot based on PycURL
Oleg Broytman [Wed, 31 Jul 2024 17:29:29 +0000 (20:29 +0300)]
Feat(Robots): Robot based on PycURL

10 months ago Fix(bkmk_robot_base): Do not pass `localhost` via proxy
Oleg Broytman [Wed, 31 Jul 2024 16:23:22 +0000 (19:23 +0300)]
Fix(bkmk_robot_base): Do not pass `localhost` via proxy

10 months ago Feat(bkmk_rurllib): Use proxy
Oleg Broytman [Wed, 31 Jul 2024 16:22:58 +0000 (19:22 +0300)]
Feat(bkmk_rurllib): Use proxy

10 months ago Style(bkmk_rurllib2): Remove unused import
Oleg Broytman [Wed, 31 Jul 2024 16:21:58 +0000 (19:21 +0300)]
Style(bkmk_rurllib2): Remove unused import

Found by `flake8`.

10 months ago Refactor(Robots): Move proxy handling to base class
Oleg Broytman [Wed, 31 Jul 2024 15:49:11 +0000 (18:49 +0300)]
Refactor(Robots): Move proxy handling to base class

This greatly simplifies robots.

10 months ago Feat(Robots): Return HTTP status code
Oleg Broytman [Wed, 31 Jul 2024 15:14:05 +0000 (18:14 +0300)]
Feat(Robots): Return HTTP status code

11 months ago Docs(TODO): Robot(s) that test many URLs in parallel
Oleg Broytman [Fri, 26 Jul 2024 10:12:14 +0000 (13:12 +0300)]
Docs(TODO): Robot(s) that test many URLs in parallel

Increase task priority.

11 months ago Docs(TODO): Robot based on aiohttp
Oleg Broytman [Fri, 26 Jul 2024 10:11:22 +0000 (13:11 +0300)]
Docs(TODO): Robot based on aiohttp

11 months ago Add `setup.cfg` and `setup.py`
Oleg Broytman [Fri, 26 Jul 2024 01:32:50 +0000 (04:32 +0300)]
Add `setup.cfg` and `setup.py`

Mostly to list required and optional dependencies.

11 months ago Fix(bkmk_db-venv): Do not exit
Oleg Broytman [Wed, 24 Jul 2024 21:01:18 +0000 (00:01 +0300)]
Fix(bkmk_db-venv): Do not exit

This is not a shell script; it is a sourced file.

11 months ago Chore: Rename `bkmk-venv` to `bkmk_db-venv`
Oleg Broytman [Wed, 24 Jul 2024 20:44:41 +0000 (23:44 +0300)]
Chore: Rename `bkmk-venv` to `bkmk_db-venv`

11 months ago Chore(bkmk-venv): Rename `.venv` to `bkmk_db-venv`
Oleg Broytman [Wed, 24 Jul 2024 20:43:45 +0000 (23:43 +0300)]
Chore(bkmk-venv): Rename `.venv` to `bkmk_db-venv`

11 months ago Feat: Cleanup redirects
Oleg Broytman [Wed, 24 Jul 2024 02:26:03 +0000 (05:26 +0300)]
Feat: Cleanup redirects

Remove verbiage.

11 months ago Feat: Skip URLs that have '%s'
Oleg Broytman [Wed, 24 Jul 2024 02:05:56 +0000 (05:05 +0300)]
Feat: Skip URLs that have '%s'

11 months ago Fix: These are not errors, just duplicates
Oleg Broytman [Wed, 24 Jul 2024 01:47:39 +0000 (04:47 +0300)]
Fix: These are not errors, just duplicates

11 months ago Build(Robots/bkmk_rrequests): Use HTTP(S) proxy instead of SOCKS5
Oleg Broytman [Tue, 23 Jul 2024 10:08:11 +0000 (13:08 +0300)]
Build(Robots/bkmk_rrequests): Use HTTP(S) proxy instead of SOCKS5

15 months ago Fix(Robot): Stop splitting and un-splitting URLs (tag: 5.3.1)
Oleg Broytman [Wed, 6 Mar 2024 15:43:48 +0000 (18:43 +0300)]
Fix(Robot): Stop splitting and un-splitting URLs

Pass `bookmark.href` as is.

15 months ago Fix(get_url): Remove excessive printing
Oleg Broytman [Wed, 6 Mar 2024 15:36:17 +0000 (18:36 +0300)]
Fix(get_url): Remove excessive printing

`robot.get()` doesn't really fill the bookmarks;
`robot.check_url()` does, but we don't call it here.

15 months ago Rename `check_url.py` to `check_urls.py`
Oleg Broytman [Wed, 6 Mar 2024 15:35:03 +0000 (18:35 +0300)]
Rename `check_url.py` to `check_urls.py`

15 months ago Rename `check_urls.py` to `check_urls_db.py`
Oleg Broytman [Wed, 6 Mar 2024 15:32:22 +0000 (18:32 +0300)]
Rename `check_urls.py` to `check_urls_db.py`

15 months ago Version 5.3.0 (tag: 5.3.0)
Oleg Broytman [Tue, 5 Mar 2024 23:48:48 +0000 (02:48 +0300)]
Version 5.3.0

   Added get_url.py: a script to get one file from a URL.
   Renamed set-URLs -> set-urls.

15 months ago Add `get_url.py`: a script to get one file from a URL
Oleg Broytman [Tue, 5 Mar 2024 23:47:23 +0000 (02:47 +0300)]
Add `get_url.py`: a script to get one file from a URL

15 months ago Rename set-URLs -> set-urls
Oleg Broytman [Tue, 5 Mar 2024 23:33:09 +0000 (02:33 +0300)]
Rename set-URLs -> set-urls

15 months ago Version 5.2.5 (tag: 5.2.5)
Oleg Broytman [Tue, 5 Mar 2024 23:24:17 +0000 (02:24 +0300)]
Version 5.2.5

   Feat(Robots/bkmk_rrequests): Ignore all problems with certificates.
   Fix(Robots/bkmk_robot_base): Pass query part.

15 months ago Fix(Robots/bkmk_robot_base): Pass query part
Oleg Broytman [Tue, 5 Mar 2024 20:22:39 +0000 (23:22 +0300)]
Fix(Robots/bkmk_robot_base): Pass query part

15 months ago Feat(Robots/bkmk_rrequests): Ignore all problems with certificates
Oleg Broytman [Tue, 5 Mar 2024 20:14:47 +0000 (23:14 +0300)]
Feat(Robots/bkmk_rrequests): Ignore all problems with certificates

Drop SSL/TLS security to the lowest level.
I want to get the pages at all costs.
Mismatched names, expired certificates, and
small DH values are less of a concern for me
than DNS errors and connection timeouts.

15 months ago Version 5.2.4: No need to re-check error 404 via proxy (tag: 5.2.4)
Oleg Broytman [Mon, 4 Mar 2024 15:15:04 +0000 (18:15 +0300)]
Version 5.2.4: No need to re-check error 404 via proxy

15 months ago Fix(Robots/bkmk_rrequests): Add forgotten spaces in log
Oleg Broytman [Mon, 4 Mar 2024 15:13:13 +0000 (18:13 +0300)]
Fix(Robots/bkmk_rrequests): Add forgotten spaces in log

15 months ago Fix(Robots/bkmk_rrequests): No need to re-check error 404 via proxy
Oleg Broytman [Mon, 4 Mar 2024 10:48:26 +0000 (13:48 +0300)]
Fix(Robots/bkmk_rrequests): No need to re-check error 404 via proxy

15 months ago Version 5.2.3 (tag: 5.2.3)
Oleg Broytman [Sun, 3 Mar 2024 20:49:55 +0000 (23:49 +0300)]
Version 5.2.3

Feat(Robots/bkmk_rrequests): Report 40x and 50x errors.
Fix HTML parser based on Bs4: Find "shortcut icon".

15 months ago Feat(Robots/bkmk_rrequests): Report 40x and 50x errors
Oleg Broytman [Sun, 3 Mar 2024 20:41:54 +0000 (23:41 +0300)]
Feat(Robots/bkmk_rrequests): Report 40x and 50x errors

15 months ago Feat(Robots/bkmk_rrequests): Change error message
Oleg Broytman [Sun, 3 Mar 2024 20:31:44 +0000 (23:31 +0300)]
Feat(Robots/bkmk_rrequests): Change error message

15 months ago Fix(parse_html/bkmk_ph_beautifulsoup4): Find "shortcut icon"
Oleg Broytman [Sun, 3 Mar 2024 14:47:58 +0000 (17:47 +0300)]
Fix(parse_html/bkmk_ph_beautifulsoup4): Find "shortcut icon"

Bs4 splits multi-valued attribute values into lists. To fix this, the value must be recombined.
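
Beautiful Soup returns multi-valued attributes such as `rel` and `class` as lists, so `rel="shortcut icon"` arrives as `['shortcut', 'icon']`. A minimal sketch of the recombination follows; `attr_as_string` and `is_icon_link` are hypothetical helpers, and the set of `rel` values the project actually accepts is an assumption.

```python
def attr_as_string(value):
    """Re-join an attribute value that a parser returned as a list."""
    if isinstance(value, (list, tuple)):
        return " ".join(value)
    return value


def is_icon_link(rel):
    """Check a <link> tag's rel value, given as a list or a plain string."""
    # The accepted values below are an assumption made for this example
    return (attr_as_string(rel) or "").lower() in ("icon", "shortcut icon")
```

The helper accepts both forms, so the same check works whether or not the parser split the value.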

15 months ago Fix(Robots/bkmk_robot_base): Add forgotten spaces in log
Oleg Broytman [Sun, 3 Mar 2024 14:24:52 +0000 (17:24 +0300)]
Fix(Robots/bkmk_robot_base): Add forgotten spaces in log

15 months ago Version 5.2.2 (tag: 5.2.2)
Oleg Broytman [Sun, 3 Mar 2024 10:29:06 +0000 (13:29 +0300)]
Version 5.2.2

   Robots/bkmk_rrequests: Add request headers.
   Robots/bkmk_robot_base: Process "data:image/" icons.

15 months ago Feat(Robots/bkmk_robot_base): Process "data:image/" icons
Oleg Broytman [Sun, 3 Mar 2024 10:22:48 +0000 (13:22 +0300)]
Feat(Robots/bkmk_robot_base): Process "data:image/" icons

15 months ago Feat(Robots/bkmk_rrequests): Add request headers
Oleg Broytman [Sun, 3 Mar 2024 10:10:13 +0000 (13:10 +0300)]
Feat(Robots/bkmk_rrequests): Add request headers

15 months ago Refactor(Robots): Refactor request headers
Oleg Broytman [Sun, 3 Mar 2024 09:48:11 +0000 (12:48 +0300)]
Refactor(Robots): Refactor request headers

15 months ago Style(Robots/bkmk_rurllib_py3): Remove unused variable
Oleg Broytman [Sun, 3 Mar 2024 09:47:40 +0000 (12:47 +0300)]
Style(Robots/bkmk_rurllib_py3): Remove unused variable

15 months ago Fix(Robots/bkmk_robot_base): Ignore unknown charset
Oleg Broytman [Sat, 2 Mar 2024 13:28:46 +0000 (16:28 +0300)]
Fix(Robots/bkmk_robot_base): Ignore unknown charset

There are sites that provide incorrect
(most probably misspelled) charset.
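
One way to detect such a charset in Python is `codecs.lookup()`, which raises `LookupError` for names it does not know. This is a minimal sketch under that assumption; `usable_charset` is a hypothetical helper, not the code in `bkmk_robot_base`.

```python
import codecs


def usable_charset(charset):
    """Return the codec's canonical name, or None if the charset is unknown."""
    if not charset:
        return None
    try:
        return codecs.lookup(charset).name
    except LookupError:
        # Unknown or misspelled charset: let the caller fall back to a default
        return None
```

A caller can then ignore the declared charset whenever the helper returns `None`.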

15 months ago Fix(Robots/bkmk_robot_base): Add forgotten space in log (tag: 5.2.1)
Oleg Broytman [Sat, 2 Mar 2024 09:28:34 +0000 (12:28 +0300)]
Fix(Robots/bkmk_robot_base): Add forgotten space in log

15 months ago Perf(Robot/requests): Speedup second access
Oleg Broytman [Sat, 2 Mar 2024 09:13:42 +0000 (12:13 +0300)]
Perf(Robot/requests): Speedup second access

Use the proxy immediately for hosts
that are already known to require it.

Don't use the proxy for hosts that aren't accessible even through it;
return an error immediately.
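
The caching described above can be sketched as two per-host sets. This is an illustration only: `ProxyMemo`, its attributes, and the `(error, body)` return convention of the two fetch callables are assumptions, not the robot's actual interface.

```python
from urllib.parse import urlsplit


class ProxyMemo:
    """Remember, per host, whether the proxy is required or useless."""

    def __init__(self, fetch_direct, fetch_via_proxy):
        # Both callables take a URL and return (error, body);
        # error is None on success. This convention is hypothetical.
        self.fetch_direct = fetch_direct
        self.fetch_via_proxy = fetch_via_proxy
        self.proxy_ok = set()      # hosts that worked only through the proxy
        self.proxy_failed = set()  # hosts that failed even through the proxy

    def get(self, url):
        host = urlsplit(url).hostname
        if host in self.proxy_failed:
            # Known to be unreachable: fail fast without any network access
            return "host is not accessible even through the proxy", None
        if host not in self.proxy_ok:
            error, body = self.fetch_direct(url)
            if error is None:
                return None, body
        # Either the direct attempt failed or the host is known to need the proxy
        error, body = self.fetch_via_proxy(url)
        if error is None:
            self.proxy_ok.add(host)
        else:
            self.proxy_failed.add(host)
        return error, body
```

On the second access to a host, at most one request is made instead of two, which is where the speedup comes from.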

15 months ago Refactor(Robot/requests)
Oleg Broytman [Fri, 1 Mar 2024 21:02:57 +0000 (00:02 +0300)]
Refactor(Robot/requests)

15 months ago Feat: Allow the robot based on requests to use a proxy (tag: 5.2.0)
Oleg Broytman [Fri, 1 Mar 2024 20:57:56 +0000 (23:57 +0300)]
Feat: Allow the robot based on requests to use a proxy

15 months ago Feat: Robot based on requests (tag: 5.1.0)
Oleg Broytman [Wed, 28 Feb 2024 21:18:38 +0000 (00:18 +0300)]
Feat: Robot based on requests

15 months ago Feat(venv): Use `venv` if `virtualenv` is not available
Oleg Broytman [Wed, 28 Feb 2024 19:03:42 +0000 (22:03 +0300)]
Feat(venv): Use `venv` if `virtualenv` is not available

19 months ago Fix(Py3): Use `urllib.parse.urlsplit()`
Oleg Broytman [Tue, 28 Nov 2023 17:04:18 +0000 (20:04 +0300)]
Fix(Py3): Use `urllib.parse.urlsplit()`

19 months ago Release 5.0.0 (tag: 5.0.0)
Oleg Broytman [Wed, 22 Nov 2023 16:09:56 +0000 (19:09 +0300)]
Release 5.0.0

19 months ago Docs: Update
Oleg Broytman [Wed, 22 Nov 2023 16:09:45 +0000 (19:09 +0300)]
Docs: Update

19 months ago Fix(Py3): Open list of titles in UTF-8
Oleg Broytman [Tue, 21 Nov 2023 18:47:34 +0000 (21:47 +0300)]
Fix(Py3): Open list of titles in UTF-8

19 months ago Fix(Py3): Always open text storage files in UTF-8
Oleg Broytman [Tue, 21 Nov 2023 18:46:42 +0000 (21:46 +0300)]
Fix(Py3): Always open text storage files in UTF-8

19 months ago Fix(Py3): Always log in UTF-8
Oleg Broytman [Mon, 20 Nov 2023 20:58:14 +0000 (23:58 +0300)]
Fix(Py3): Always log in UTF-8

19 months ago Fix(Py3): `html.parser` cannot parse bytes
Oleg Broytman [Mon, 20 Nov 2023 17:49:22 +0000 (20:49 +0300)]
Fix(Py3): `html.parser` cannot parse bytes

Decode to unicode from a known encoding.
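
In Python 3, `HTMLParser.feed()` expects `str`, so raw page content has to be decoded first. The sketch below shows the idea with a small title extractor; `TitleParser` and `parse_title` are hypothetical names, and the charset is assumed to be already known (for example from the HTTP headers).

```python
from html.parser import HTMLParser


class TitleParser(HTMLParser):
    """Collect the text inside the <title> element."""

    def __init__(self):
        super().__init__()
        self.in_title = False
        self.title = ""

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self.in_title = True

    def handle_endtag(self, tag):
        if tag == "title":
            self.in_title = False

    def handle_data(self, data):
        if self.in_title:
            self.title += data


def parse_title(raw, charset):
    """Decode raw bytes with a known charset, then parse the resulting text."""
    parser = TitleParser()
    # Undecodable bytes are replaced rather than raising an exception
    parser.feed(raw.decode(charset, errors="replace"))
    parser.close()
    return parser.title
```

For example, `parse_title(page_bytes, "koi8-r")` decodes a KOI8-R page before the parser ever sees it.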

19 months ago Fix(Py3): Decode content using HTTP charset
Oleg Broytman [Mon, 20 Nov 2023 17:34:36 +0000 (20:34 +0300)]
Fix(Py3): Decode content using HTTP charset

19 months ago Fix(Py3): `urllib` writes its files as bytes
Oleg Broytman [Mon, 20 Nov 2023 17:33:42 +0000 (20:33 +0300)]
Fix(Py3): `urllib` writes its files as bytes

19 months ago Fix(parse_html CLI): Report encodings and the title
Oleg Broytman [Mon, 20 Nov 2023 16:21:22 +0000 (19:21 +0300)]
Fix(parse_html CLI): Report encodings and the title

19 months ago Fix(parse_html/bkmk_parse_html.py): Open the file with known encoding
Oleg Broytman [Mon, 20 Nov 2023 16:20:31 +0000 (19:20 +0300)]
Fix(parse_html/bkmk_parse_html.py): Open the file with known encoding

19 months ago Fix(parse_html/bkmk_ph_beautifulsoup4): Fix title recombination
Oleg Broytman [Mon, 20 Nov 2023 01:12:54 +0000 (04:12 +0300)]
Fix(parse_html/bkmk_ph_beautifulsoup4): Fix title recombination

19 months ago Fix(Py3): Remove forgotten `.decode()`/`.encode()`
Oleg Broytman [Mon, 20 Nov 2023 01:02:30 +0000 (04:02 +0300)]
Fix(Py3): Remove forgotten `.decode()`/`.encode()`

19 months ago Feat: Remove some HTML parsers
Oleg Broytman [Mon, 20 Nov 2023 00:50:26 +0000 (03:50 +0300)]
Feat: Remove some HTML parsers

EtreeTidy is outdated and buggy.
html5 is outdated.

19 months ago Style: Fix `flake8` E501 line too long
Oleg Broytman [Mon, 20 Nov 2023 00:39:46 +0000 (03:39 +0300)]
Style: Fix `flake8` E501 line too long

19 months ago Style: Fix `flake8` E402 module level import not at top of file
Oleg Broytman [Mon, 20 Nov 2023 00:38:00 +0000 (03:38 +0300)]
Style: Fix `flake8` E402 module level import not at top of file

19 months ago Chore(venv): Only run `pip install` on fresh virtual env
Oleg Broytman [Mon, 20 Nov 2023 00:16:46 +0000 (03:16 +0300)]
Chore(venv): Only run `pip install` on fresh virtual env

19 months ago Feat(check_url.py): Print "Moved", "Size", "Md5"
Oleg Broytman [Mon, 20 Nov 2023 00:00:06 +0000 (03:00 +0300)]
Feat(check_url.py): Print "Moved", "Size", "Md5"

19 months ago Fix(robots): Fix "Content-Length" header returning `None`
Oleg Broytman [Sun, 19 Nov 2023 23:58:53 +0000 (02:58 +0300)]
Fix(robots): Fix "Content-Length" header returning `None`

19 months ago Fix(robots): Store charset
Oleg Broytman [Sat, 18 Nov 2023 16:47:22 +0000 (19:47 +0300)]
Fix(robots): Store charset