Derp. Fix missing constant post rename
This commit is contained in:
119
rss.xml
119
rss.xml
@@ -6,6 +6,125 @@
|
||||
<link>https://konsthol.eu/rss.xml</link>
|
||||
<atom:link href="https://konsthol.eu/rss.xml" rel="self" type="application/rss+xml"/>
|
||||
|
||||
<item>
|
||||
<title>Simple way to extend yt-dlp</title>
|
||||
<link>https://konsthol.eu/log/simple_way_to_extend_yt_dlp-12-01-2025.html</link>
|
||||
<pubDate>Sun, 12 Jan 2025</pubDate>
|
||||
<description><![CDATA[<blockquote>
|
||||
<p>DATE: Sun 12 Jan 2025 15:51 By: konsthol@pm.me</p>
|
||||
</blockquote>
|
||||
<h1 id="simple-way-to-extend-yt-dlp">Simple way to extend yt-dlp</h1>
|
||||
<p>
|
||||
Lots of people use yt-dlp either directly or indirectly through mpv. It’s a
|
||||
powerful tool that acts as a website scraper and it supports thousands of
|
||||
websites. The website its mostly used for is like the name suggests YouTube.
|
||||
Now, YouTube is a great resource but usage through the website is quite
|
||||
unpleasant so lots of people opt out to use alternative frontends like
|
||||
Invidious or Piped. Lots of times you just want to use mpv to stream a YouTube
|
||||
video by providing the link like:
|
||||
</p>
|
||||
<blockquote>
|
||||
<p>mpv https://youtube.com/watch?v=[VideoID]</p>
|
||||
</blockquote>
|
||||
<p>
|
||||
That works like a charm, but what happens when you provide a link of an
|
||||
alternative frontend? Well, it translates it to the aforementioned format in
|
||||
order to work. But there are so many instances of Invidious and Piped, so how
|
||||
does it know what to do? That was my question as well since I use a self
|
||||
hosted Piped instance and it does not recognize the domain. Obviously.
|
||||
</p>
|
||||
<p>
|
||||
Thankfully, yt-dlp is an open source project so you can actually see what goes
|
||||
on behind the scenes. In my case, I installed it with the Arch Linux package
|
||||
manager and it resides at:
|
||||
</p>
|
||||
<blockquote>
|
||||
<p>/usr/lib/python3.13/site-packages/yt_dlp/</p>
|
||||
</blockquote>
|
||||
<p>
|
||||
The way yt-dlp works is that it has a folder called “extractor” in that path
|
||||
and in that folder there is a python file for each supported website. In
|
||||
YouTube’s case it’s youtube.py. I opened it and I saw this:
|
||||
</p>
|
||||
<pre><code>class YoutubeBaseInfoExtractor(InfoExtractor):
|
||||
"""Provide base functions for Youtube extractors"""
|
||||
|
||||
_RESERVED_NAMES = (
|
||||
r'channel|c|user|playlist|watch|w|v|embed|e|live|watch_popup|clip|'
|
||||
r'shorts|movies|results|search|shared|hashtag|trending|explore|feed|feeds|'
|
||||
r'browse|oembed|get_video_info|iframe_api|s/player|source|'
|
||||
r'storefront|oops|index|account|t/terms|about|upload|signin|logout')
|
||||
|
||||
_PLAYLIST_ID_RE = r'(?:(?:PL|LL|EC|UU|FL|RD|UL|TL|PU|OLAK5uy_)[0-9A-Za-z-_]{10,}|RDMM|WL|LL|LM)'
|
||||
|
||||
# _NETRC_MACHINE = 'youtube'
|
||||
|
||||
# If True it will raise an error if no login info is provided
|
||||
_LOGIN_REQUIRED = False
|
||||
|
||||
_INVIDIOUS_SITES = (
|
||||
# invidious-redirect websites
|
||||
r'(?:www\.)?redirect\.invidious\.io',
|
||||
r'(?:(?:www|dev)\.)?invidio\.us',
|
||||
# Invidious instances taken from https://github.com/iv-org/documentation/blob/master/docs/instances.md
|
||||
r'(?:www\.)?invidious\.pussthecat\.org',
|
||||
r'(?:www\.)?invidious\.zee\.li',
|
||||
[more instances here]
|
||||
)</code></pre>
|
||||
<p>
|
||||
There is a class called YoutubeBaseInfoExtractor that has an array of
|
||||
instances called _INVIDIOUS_SITES that uses a regex to catch every domain
|
||||
there. Now, I saw at the GitHub page of yt-dlp a lot of people asking the
|
||||
maintainers to add more instances on this list. Theoretically you also can
|
||||
just edit the file and add a domain so that it recognizes that one too. But,
|
||||
in my personal opinion it’s never a good idea to edit upstream files because
|
||||
as the program updates your changes will be overwritten. So I found another
|
||||
way to deal with this.
|
||||
</p>
|
||||
<p>
|
||||
You see, yt-dlp is not just a command line utility. You can use it as a
|
||||
library to make your own extractors for websites. The way you do that is by
|
||||
creating your own plugins. In my case, I didn’t actually want to make a new
|
||||
extractor but somehow extend an array of an already existing one. Not all
|
||||
extractors use this method but since YouTube does, it would work. So I made
|
||||
this file at this location:
|
||||
</p>
|
||||
<blockquote>
|
||||
<p>~/.config/yt-dlp/plugins/piped/yt_dlp_plugins/extractor/piped.py</p>
|
||||
</blockquote>
|
||||
<p>The contents are simple:</p>
|
||||
<pre><code>from yt_dlp.extractor.youtube import YoutubeBaseInfoExtractor, YoutubeIE
|
||||
|
||||
class CustomYoutubeBaseInfoExtractor(YoutubeBaseInfoExtractor):
|
||||
_INVIDIOUS_SITES = YoutubeBaseInfoExtractor._INVIDIOUS_SITES + (
|
||||
r'(?:www\.)?piped\.konsthol\.eu',
|
||||
)
|
||||
|
||||
class PipedKonstholYoutubeIE(YoutubeIE, CustomYoutubeBaseInfoExtractor):
|
||||
_VALID_URL = r'https?://(?:www\.)?piped\.konsthol\.eu/watch\?v=(?P<id>[0-9A-Za-z_-]{11})'
|
||||
IE_NAME = 'piped.konsthol.eu'
|
||||
</code></pre>
|
||||
<p>
|
||||
We import the class that contains the array we need and the youtube extractor.
|
||||
We make a new class in which we provide the one that has the array. We access
|
||||
the array and add a new regex for our domain. Then we make a new class for the
|
||||
extractor, provide the one we just created and the YouTube extractor class and
|
||||
we tell it to work for urls that look like the one we provided. In that way,
|
||||
this pseudo extractor is being activated when we provide a url that looks like
|
||||
this, it extends the actual YouTube extractor and activates that one, only
|
||||
this time it works for our domain too.
|
||||
</p>
|
||||
<p>
|
||||
It’s amazing what you can do with open source software just by observing how a
|
||||
program works. Now every time someone needs a new domain for an alternative
|
||||
YouTube frontend added, instead of asking the developers to do that, using
|
||||
this simple solution he/she can just add it to the plugin.
|
||||
</p>
|
||||
<p><a href="https://github.com/yt-dlp/yt-dlp/">yt-dlp GitHub page</a><br /></p>
|
||||
<p><a href="..">..</a></p>]]></description>
|
||||
</item>
|
||||
|
||||
|
||||
<item>
|
||||
<title>The magic of Wake-On-LAN</title>
|
||||
<link>https://konsthol.eu/log/the_magic_of_wake_on_lan-19-12-2024.html</link>
|
||||
|
||||
Reference in New Issue
Block a user