Commit e474810

committed
Deployed d051193 with MkDocs version: 1.6.1
1 parent 9ffb5ac commit e474810

7 files changed

+578
-212
lines changed

Fetching XML data/Special Exports/index.html

Lines changed: 64 additions & 50 deletions
@@ -74,7 +74,7 @@
7474
<div data-md-component="skip">
7575

7676

77-
<a href="#using-the-special-export-tool" class="md-skip">
77+
<a href="#importing-packages" class="md-skip">
7878
Skip to content
7979
</a>
8080

@@ -389,33 +389,27 @@
389389
<ul class="md-nav__list" data-md-component="toc" data-md-scrollfix>
390390

391391
<li class="md-nav__item">
392-
<a href="#using-the-special-export-tool" class="md-nav__link">
392+
<a href="#importing-packages" class="md-nav__link">
393393
<span class="md-ellipsis">
394-
Using the Special Export Tool
394+
Importing Packages
395395
</span>
396396
</a>
397397

398-
<nav class="md-nav" aria-label="Using the Special Export Tool">
399-
<ul class="md-nav__list">
400-
401-
<li class="md-nav__item">
402-
<a href="#examples" class="md-nav__link">
398+
</li>
399+
400+
<li class="md-nav__item">
401+
<a href="#using-the-special-export-tool" class="md-nav__link">
403402
<span class="md-ellipsis">
404-
Examples
403+
Using the Special Export Tool
405404
</span>
406405
</a>
407406

408-
</li>
409-
410-
</ul>
411-
</nav>
412-
413407
</li>
414408

415409
<li class="md-nav__item">
416-
<a href="#using-the-requests-library" class="md-nav__link">
410+
<a href="#fetching-xml-data-with-requests" class="md-nav__link">
417411
<span class="md-ellipsis">
418-
Using the requests Library
412+
Fetching XML Data with requests
419413
</span>
420414
</a>
421415

@@ -723,33 +717,27 @@
723717
<ul class="md-nav__list" data-md-component="toc" data-md-scrollfix>
724718

725719
<li class="md-nav__item">
726-
<a href="#using-the-special-export-tool" class="md-nav__link">
720+
<a href="#importing-packages" class="md-nav__link">
727721
<span class="md-ellipsis">
728-
Using the Special Export Tool
722+
Importing Packages
729723
</span>
730724
</a>
731725

732-
<nav class="md-nav" aria-label="Using the Special Export Tool">
733-
<ul class="md-nav__list">
734-
735-
<li class="md-nav__item">
736-
<a href="#examples" class="md-nav__link">
726+
</li>
727+
728+
<li class="md-nav__item">
729+
<a href="#using-the-special-export-tool" class="md-nav__link">
737730
<span class="md-ellipsis">
738-
Examples
731+
Using the Special Export Tool
739732
</span>
740733
</a>
741734

742-
</li>
743-
744-
</ul>
745-
</nav>
746-
747735
</li>
748736

749737
<li class="md-nav__item">
750-
<a href="#using-the-requests-library" class="md-nav__link">
738+
<a href="#fetching-xml-data-with-requests" class="md-nav__link">
751739
<span class="md-ellipsis">
752-
Using the requests Library
740+
Fetching XML Data with requests
753741
</span>
754742
</a>
755743

@@ -777,18 +765,21 @@ <h1>Special Export tool</h1>
777765

778766
<p>The <strong>Special Export</strong> tool fetches specific pages with their raw content (<em>wikitext</em>) in real-time, without needing to download the entire dataset. The content is provided in XML format.</p>
779767
<div class="toc"><span class="toctitle">On this page</span><ul>
780-
<li><a href="#using-the-special-export-tool">Using the Special Export Tool</a><ul>
781-
<li><a href="#examples">Examples</a></li>
782-
</ul>
783-
</li>
784-
<li><a href="#using-the-requests-library">Using the requests Library</a></li>
768+
<li><a href="#importing-packages">Importing Packages</a></li>
769+
<li><a href="#using-the-special-export-tool">Using the Special Export Tool</a></li>
770+
<li><a href="#fetching-xml-data-with-requests">Fetching XML Data with requests</a></li>
785771
</ul>
786772
</div>
773+
<h2 id="importing-packages">Importing Packages<a class="headerlink" href="#importing-packages" title="Permanent link">&para;</a></h2>
774+
<div class="language-python highlight"><table class="highlighttable"><tr><th colspan="2" class="filename"><span class="filename">Python</span></th></tr><tr><td class="linenos"><div class="linenodiv"><pre><span></span><span class="normal">1</span></pre></div></td><td class="code"><div><pre><span></span><code><span class="kn">import</span> <span class="nn">requests</span> <span class="c1"># to fetch info from URLs</span>
775+
</code></pre></div></td></tr></table></div>
787776
<h2 id="using-the-special-export-tool">Using the <strong>Special Export</strong> Tool<a class="headerlink" href="#using-the-special-export-tool" title="Permanent link">&para;</a></h2>
788777
<p>You can actually use <strong>Special:Export</strong> to retrieve pages from <em>any</em> Wiki site. On the German Wiktionary, however, the tool is labelled <strong>Spezial:Exportieren</strong>, but it works the same way.</p>
789-
<h3 id="examples">Examples<a class="headerlink" href="#examples" title="Permanent link">&para;</a></h3>
790778
<p><strong>Exporting Pages from Any Wiki Site</strong></p>
791779
<p>To access the XML content of the page titled "Austria" from English Wikipedia, you can use the following Python code. It builds the export link and prints it; open the printed URL in your browser to view the XML:</p>
780+
<div class="tabbed-set tabbed-alternate" data-tabs="1:2"><input checked="checked" id="exec-2--__tabbed_1_1" name="exec-2--__tabbed_1" type="radio" /><input id="exec-2--__tabbed_1_2" name="exec-2--__tabbed_1" type="radio" /><div class="tabbed-labels"><label for="exec-2--__tabbed_1_1">Source</label><label for="exec-2--__tabbed_1_2">Result</label></div>
781+
<div class="tabbed-content">
782+
<div class="tabbed-block">
792783
<div class="language-python highlight"><table class="highlighttable"><tr><th colspan="2" class="filename"><span class="filename">Python</span></th></tr><tr><td class="linenos"><div class="linenodiv"><pre><span></span><span class="normal">1</span>
793784
<span class="normal">2</span>
794785
<span class="normal">3</span>
@@ -797,21 +788,34 @@ <h3 id="examples">Examples<a class="headerlink" href="#examples" title="Permanen
797788
<span class="n">url</span> <span class="o">=</span> <span class="sa">f</span><span class="s1">&#39;https://</span><span class="si">{</span><span class="n">domain</span><span class="si">}</span><span class="s1">/wiki/Special:Export/</span><span class="si">{</span><span class="n">title</span><span class="si">}</span><span class="s1">&#39;</span>
798789
<span class="nb">print</span><span class="p">(</span><span class="n">url</span><span class="p">)</span>
799790
</code></pre></div></td></tr></table></div>
791+
</div>
792+
<div class="tabbed-block">
800793
<div class="language-pycon highlight"><table class="highlighttable"><tr><th colspan="2" class="filename"><span class="filename">Python Console Session</span></th></tr><tr><td class="linenos"><div class="linenodiv"><pre><span></span><span class="normal">1</span></pre></div></td><td class="code"><div><pre><span></span><code><span class="go">https://en.wikipedia.org/wiki/Special:Export/Austria</span>
801794
</code></pre></div></td></tr></table></div>
795+
</div>
796+
</div>
797+
</div>
802798
<p><strong>Exporting Pages from the German Wiktionary</strong></p>
803-
<p>For the German Wiktionary, the export tool uses <code>Spezial:Exportieren</code> instead of <code>Special:Export</code>. You can use similar Python code to open the export link for the page titled "schön" (German for "beautiful"):</p>
799+
<p>For the German Wiktionary, the export tool uses <code>Spezial:Exportieren</code> instead of <code>Special:Export</code>. You can use similar Python code to build the export link for the page titled "hoch" (German for "high"):</p>
800+
<div class="tabbed-set tabbed-alternate" data-tabs="1:2"><input checked="checked" id="exec-3--__tabbed_1_1" name="exec-3--__tabbed_1" type="radio" /><input id="exec-3--__tabbed_1_2" name="exec-3--__tabbed_1" type="radio" /><div class="tabbed-labels"><label for="exec-3--__tabbed_1_1">Source</label><label for="exec-3--__tabbed_1_2">Result</label></div>
801+
<div class="tabbed-content">
802+
<div class="tabbed-block">
804803
<div class="language-python highlight"><table class="highlighttable"><tr><th colspan="2" class="filename"><span class="filename">Python</span></th></tr><tr><td class="linenos"><div class="linenodiv"><pre><span></span><span class="normal">1</span>
805804
<span class="normal">2</span>
806805
<span class="normal">3</span>
807-
<span class="normal">4</span></pre></div></td><td class="code"><div><pre><span></span><code><span class="n">title</span> <span class="o">=</span> <span class="s1">&#39;schön&#39;</span>
806+
<span class="normal">4</span></pre></div></td><td class="code"><div><pre><span></span><code><span class="n">title</span> <span class="o">=</span> <span class="s1">&#39;hoch&#39;</span>
808807
<span class="n">domain</span> <span class="o">=</span> <span class="s1">&#39;de.wiktionary.org&#39;</span>
809808
<span class="n">url</span> <span class="o">=</span> <span class="sa">f</span><span class="s1">&#39;https://</span><span class="si">{</span><span class="n">domain</span><span class="si">}</span><span class="s1">/wiki/Spezial:Exportieren/</span><span class="si">{</span><span class="n">title</span><span class="si">}</span><span class="s1">&#39;</span>
810809
<span class="nb">print</span><span class="p">(</span><span class="n">url</span><span class="p">)</span>
811810
</code></pre></div></td></tr></table></div>
812-
<div class="language-pycon highlight"><table class="highlighttable"><tr><th colspan="2" class="filename"><span class="filename">Python Console Session</span></th></tr><tr><td class="linenos"><div class="linenodiv"><pre><span></span><span class="normal">1</span></pre></div></td><td class="code"><div><pre><span></span><code><span class="go">https://de.wiktionary.org/wiki/Spezial:Exportieren/schön</span>
811+
</div>
812+
<div class="tabbed-block">
813+
<div class="language-pycon highlight"><table class="highlighttable"><tr><th colspan="2" class="filename"><span class="filename">Python Console Session</span></th></tr><tr><td class="linenos"><div class="linenodiv"><pre><span></span><span class="normal">1</span></pre></div></td><td class="code"><div><pre><span></span><code><span class="go">https://de.wiktionary.org/wiki/Spezial:Exportieren/hoch</span>
813814
</code></pre></div></td></tr></table></div>
814-
<h2 id="using-the-requests-library">Using the <code>requests</code> Library<a class="headerlink" href="#using-the-requests-library" title="Permanent link">&para;</a></h2>
815+
</div>
816+
</div>
817+
</div>
818+
<h2 id="fetching-xml-data-with-requests">Fetching XML Data with <code>requests</code><a class="headerlink" href="#fetching-xml-data-with-requests" title="Permanent link">&para;</a></h2>
815819
<p>To programmatically fetch and download XML content, you can use Python's <code>requests</code> library. This example shows how to build the URL, make a request, and get the XML content of a Wiktionary page by its title.</p>
816820
<div class="language-python highlight"><table class="highlighttable"><tr><th colspan="2" class="filename"><span class="filename">Python</span></th></tr><tr><td class="linenos"><div class="linenodiv"><pre><span></span><span class="normal"> 1</span>
817821
<span class="normal"> 2</span>
@@ -824,11 +828,7 @@ <h2 id="using-the-requests-library">Using the <code>requests</code> Library<a cl
824828
<span class="normal"> 9</span>
825829
<span class="normal">10</span>
826830
<span class="normal">11</span>
827-
<span class="normal">12</span>
828-
<span class="normal">13</span>
829-
<span class="normal">14</span></pre></div></td><td class="code"><div><pre><span></span><code><span class="kn">import</span> <span class="nn">requests</span>
830-
831-
<span class="k">def</span> <span class="nf">fetch</span><span class="p">(</span><span class="n">title</span><span class="p">):</span>
831+
<span class="normal">12</span></pre></div></td><td class="code"><div><pre><span></span><code><span class="k">def</span> <span class="nf">fetch</span><span class="p">(</span><span class="n">title</span><span class="p">):</span>
832832
<span class="c1"># Construct the URL for the XML export of the given page title</span>
833833
<span class="n">url</span> <span class="o">=</span> <span class="sa">f</span><span class="s1">&#39;https://de.wiktionary.org/wiki/Spezial:Exportieren/</span><span class="si">{</span><span class="n">title</span><span class="si">}</span><span class="s1">&#39;</span>
834834

@@ -839,10 +839,10 @@ <h2 id="using-the-requests-library">Using the <code>requests</code> Library<a cl
839839
<span class="n">resp</span><span class="o">.</span><span class="n">raise_for_status</span><span class="p">()</span>
840840

841841
<span class="c1"># Return the XML content of the requested page</span>
842-
<span class="k">return</span> <span class="n">resp</span><span class="o">.</span><span class="n">content</span>
842+
<span class="k">return</span> <span class="n">resp</span><span class="o">.</span><span class="n">text</span>
843843
</code></pre></div></td></tr></table></div>
844-
<p>Next, let us attempt to retrieve the XML content for the page titled "hoch" and print the initial 500 bytes for a glimpse of the XML content displayed in the <code>Result</code> tab.</p>
845-
<div class="tabbed-set tabbed-alternate" data-tabs="1:2"><input checked="checked" id="exec-4--__tabbed_1_1" name="exec-4--__tabbed_1" type="radio" /><input id="exec-4--__tabbed_1_2" name="exec-4--__tabbed_1" type="radio" /><div class="tabbed-labels"><label for="exec-4--__tabbed_1_1">Source</label><label for="exec-4--__tabbed_1_2">Result</label></div>
844+
<p>Next, let us retrieve the XML content for the page titled "hoch" and print the first 500 characters for a glimpse of the XML content.</p>
845+
<div class="tabbed-set tabbed-alternate" data-tabs="1:2"><input checked="checked" id="exec-5--__tabbed_1_1" name="exec-5--__tabbed_1" type="radio" /><input id="exec-5--__tabbed_1_2" name="exec-5--__tabbed_1" type="radio" /><div class="tabbed-labels"><label for="exec-5--__tabbed_1_1">Source</label><label for="exec-5--__tabbed_1_2">Result</label></div>
846846
<div class="tabbed-content">
847847
<div class="tabbed-block">
848848
<div class="language-python highlight"><table class="highlighttable"><tr><th colspan="2" class="filename"><span class="filename">Python</span></th></tr><tr><td class="linenos"><div class="linenodiv"><pre><span></span><span class="normal">1</span>
@@ -851,7 +851,21 @@ <h2 id="using-the-requests-library">Using the <code>requests</code> Library<a cl
851851
</code></pre></div></td></tr></table></div>
852852
</div>
853853
<div class="tabbed-block">
854-
<div class="language-pycon highlight"><table class="highlighttable"><tr><th colspan="2" class="filename"><span class="filename">Python Console Session</span></th></tr><tr><td class="linenos"><div class="linenodiv"><pre><span></span><span class="normal">1</span></pre></div></td><td class="code"><div><pre><span></span><code><span class="go">b&#39;&lt;mediawiki xmlns=&quot;http://www.mediawiki.org/xml/export-0.11/&quot; xmlns:xsi=&quot;http://www.w3.org/2001/XMLSchema-instance&quot; xsi:schemaLocation=&quot;http://www.mediawiki.org/xml/export-0.11/ http://www.mediawiki.org/xml/export-0.11.xsd&quot; version=&quot;0.11&quot; xml:lang=&quot;de&quot;&gt;\n &lt;siteinfo&gt;\n &lt;sitename&gt;Wiktionary&lt;/sitename&gt;\n &lt;dbname&gt;dewiktionary&lt;/dbname&gt;\n &lt;base&gt;https://de.wiktionary.org/wiki/Wiktionary:Hauptseite&lt;/base&gt;\n &lt;generator&gt;MediaWiki 1.44.0-wmf.16&lt;/generator&gt;\n &lt;case&gt;case-sensitive&lt;/case&gt;\n &lt;namesp&#39;</span>
854+
<div class="language-pycon highlight"><table class="highlighttable"><tr><th colspan="2" class="filename"><span class="filename">Python Console Session</span></th></tr><tr><td class="linenos"><div class="linenodiv"><pre><span></span><span class="normal">1</span>
855+
<span class="normal">2</span>
856+
<span class="normal">3</span>
857+
<span class="normal">4</span>
858+
<span class="normal">5</span>
859+
<span class="normal">6</span>
860+
<span class="normal">7</span>
861+
<span class="normal">8</span></pre></div></td><td class="code"><div><pre><span></span><code><span class="go">&lt;mediawiki xmlns=&quot;http://www.mediawiki.org/xml/export-0.11/&quot; xmlns:xsi=&quot;http://www.w3.org/2001/XMLSchema-instance&quot; xsi:schemaLocation=&quot;http://www.mediawiki.org/xml/export-0.11/ http://www.mediawiki.org/xml/export-0.11.xsd&quot; version=&quot;0.11&quot; xml:lang=&quot;de&quot;&gt;</span>
862+
<span class="go"> &lt;siteinfo&gt;</span>
863+
<span class="go"> &lt;sitename&gt;Wiktionary&lt;/sitename&gt;</span>
864+
<span class="go"> &lt;dbname&gt;dewiktionary&lt;/dbname&gt;</span>
865+
<span class="go"> &lt;base&gt;https://de.wiktionary.org/wiki/Wiktionary:Hauptseite&lt;/base&gt;</span>
866+
<span class="go"> &lt;generator&gt;MediaWiki 1.44.0-wmf.17&lt;/generator&gt;</span>
867+
<span class="go"> &lt;case&gt;case-sensitive&lt;/case&gt;</span>
868+
<span class="go"> &lt;namesp</span>
855869
</code></pre></div></td></tr></table></div>
856870
</div>
857871
</div>
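
Reviewer's sketch (not part of this commit): the page builds export URLs by interpolating the title directly into an f-string. That works for "hoch", but a title with non-ASCII characters such as "schön" is arguably safer when percent-encoded first. The helper names `build_export_url` and `fetch_export`, their default arguments, and the 10-second timeout are assumptions made for illustration; only `requests.get`, `raise_for_status()`, and `resp.text` come from the documented code.

```python
from urllib.parse import quote


def build_export_url(title, domain='de.wiktionary.org',
                     export_page='Spezial:Exportieren'):
    """Build a Special:Export-style URL for a page title (hypothetical helper)."""
    # Percent-encode the title (UTF-8) so characters like 'ö' are URL-safe.
    return f'https://{domain}/wiki/{export_page}/{quote(title)}'


def fetch_export(title, **url_options):
    """Fetch the XML export of a page as text (hypothetical helper)."""
    # Imported here so that building URLs does not require the third-party package.
    import requests

    url = build_export_url(title, **url_options)
    # The timeout value is an assumption; the documented fetch() sets none.
    resp = requests.get(url, timeout=10)
    resp.raise_for_status()
    return resp.text


print(build_export_url('schön'))
print(build_export_url('Austria', domain='en.wikipedia.org',
                       export_page='Special:Export'))
```

Calling `fetch_export('hoch')[:500]` would then mirror the documented example, with the difference that the title is encoded before the request is made.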
