Commit Graph

39 Commits

Author SHA1 Message Date
Jan Bader
0eee28968f Move download all setting to config.py 2026-04-12 22:37:35 +02:00
Jan Bader
038d239033 simplify current year implementation 2026-04-12 21:00:13 +02:00
Jan Bader
395998de88 only get current year by default 2026-04-12 20:59:06 +02:00
Jan Bader
0ad9567cdc unescape html 2026-04-08 22:08:17 +02:00
Jan Bader
cf9b97699f allow spaces in filename 2026-04-08 21:57:41 +02:00
but replace newline and tab characters
Jan Bader
5a3d60e45e use date for invoice name 2026-04-08 09:57:54 +02:00
Jan Bader
c2bde3b192 manually replace escaped linebreaks in notes 2026-04-08 09:51:13 +02:00
Jan Bader
881f6f5bd7 remove redundant goto 2026-04-08 09:47:08 +02:00
Jan Bader
f703f39bcb make load more strict 2026-04-07 23:51:57 +02:00
Jan Bader
9a6372d5c0 do not wait for url on print page 2026-04-07 23:50:28 +02:00
Jan Bader
2b4e44bcdc Use different load wait 2026-04-07 23:50:14 +02:00
Jan Bader
d06321973c simplify b2 url 2026-04-07 23:46:27 +02:00
Jan Bader
441438070e improve export by waiting for invoice page 2026-04-07 23:40:27 +02:00
Jan Bader
a4cffaae21 skip login if not needed 2026-04-07 23:39:44 +02:00
Jan Bader
0b3e11d6a5 printpayment directly if no anchor 2026-04-07 23:37:48 +02:00
Jan Bader
1cbe80ac00 fetch b2 & groups and fetch all years 2026-04-07 23:35:38 +02:00
Jan Bader
03bb94db2d fix: deduping by url 2026-04-07 23:28:04 +02:00
Jan Bader
a6d725f12c handle all invoice links 2026-04-07 23:26:45 +02:00
Jan Bader
c493d21403 Handle print from other page 2026-04-07 23:26:31 +02:00
Jan Bader
2f7ada655e skip hidden and disabled controls 2026-04-07 23:18:51 +02:00
Jan Bader
6e29103f5f add overrides and remove vatid & type from fields to fill 2026-04-07 23:17:03 +02:00
those are configuration options on the page beforehand
Jan Bader
2fc484c7b7 add more options for links 2026-04-07 22:51:44 +02:00
Jan Bader
65a39c0ec4 keep browser open on failed login 2026-04-07 22:47:46 +02:00
Jan Bader
bd0a6bf9a9 Increase Timeout on login 2026-04-07 22:47:37 +02:00
Jan Bader
894b7cee54 use persistent store and implement login 2026-04-05 22:48:08 +02:00
Jan Bader
d01776e2ab add stealth 2026-04-05 22:22:49 +02:00
Jan Bader
c53ded0ed5 Retry 2026-04-05 22:18:24 +02:00
Jan Bader
a9bb2460c6 convert to backblaze fetcher 2026-04-05 22:01:50 +02:00
Jan Bader
66e1c9e0e0 Improve extraction 2026-04-04 21:16:28 +02:00
Jan Bader
40104dc0f9 Add logging 2026-04-04 21:10:33 +02:00
Jan Bader
684f7c87e6 Harder ignoring of ui prompts 2026-04-04 21:00:56 +02:00
Jan Bader
d32c696f6e Also use AI for the content 2026-04-04 20:56:44 +02:00
Jan Bader
1c719f4381 load .env in dev shell 2026-04-04 20:51:10 +02:00
Jan Bader
0163767dd1 Add AI summarization 2026-04-04 20:50:59 +02:00
Jan Bader
db44427c1f Ignore some ui prompts 2026-04-04 20:46:48 +02:00
Jan Bader
99ba4f6ac8 Ignore language list 2026-04-04 20:41:00 +02:00
Jan Bader
75a4ab20fd add python deps & playwright 2026-04-04 20:38:40 +02:00
Jan Bader
d343a48af1 Add flake 2026-04-04 20:31:37 +02:00
naki
c997e764b5 feat: Initial commit - Content Extractor for YouTube, Instagram, and blogs 2026-03-05 13:02:58 +05:30
- YouTube extraction with transcript support
- Instagram reel extraction via browser automation
- Blog/article web scraping
- Auto-save to Obsidian vaults
- Smart key point generation
- Configurable via .env file
- Quick extract shell script

Tech stack: Python, requests, beautifulsoup4, playwright, youtube-transcript-api