{"id":736,"date":"2014-11-07T11:04:33","date_gmt":"2014-11-07T11:04:33","guid":{"rendered":"https:\/\/www.taywa.ch\/blog\/?p=736"},"modified":"2018-01-24T14:09:47","modified_gmt":"2018-01-24T14:09:47","slug":"mit-wget-komplette-website-kopieren-als-mirror-oder-offline-archiv","status":"publish","type":"post","link":"https:\/\/www.taywa.ch\/blog\/ubuntu\/mit-wget-komplette-website-kopieren-als-mirror-oder-offline-archiv\/","title":{"rendered":"Copying a complete website with wget in the shell, as a mirror or offline archive"},"content":{"rendered":"<ul>\n<li>--mirror<br \/>\nmakes wget recursive; shorthand for the options -r -N -l inf --no-remove-listing<\/li>\n<li>--adjust-extension<br \/>\nappends .html to downloaded HTML pages whose URL does not end in .html (for example .php pages); GET parameters stay part of the file name<\/li>\n<li>--convert-links<br \/>\nrewrites links in the downloaded pages to relative paths so they work offline<\/li>\n<li>--page-requisites<br \/>\ndownloads the files each page needs, such as images, CSS and JS; linked PDFs are fetched by the recursion of --mirror<\/li>\n<li>-e robots=off<br \/>\nignores the robots.txt file so that the mirror is really complete<\/li>\n<\/ul>\n<div class=\"bash dean_ch\"><span class=\"kw2\">wget<\/span> <span class=\"re5\">--mirror<\/span> <span class=\"re5\">--adjust-extension<\/span> <span class=\"re5\">--convert-links<\/span> <span class=\"re5\">--page-requisites<\/span> <span class=\"re5\">-e<\/span> <span class=\"re2\">robots<\/span>=off http:<span class=\"sy0\">\/\/<\/span>www.example.com<\/p>\n<p><span class=\"co0\"># or, in short:<\/span><\/p>\n<p><span class=\"kw2\">wget<\/span> <span class=\"re5\">-m<\/span> <span class=\"re5\">-E<\/span> <span class=\"re5\">-k<\/span> <span class=\"re5\">-p<\/span> <span class=\"re5\">-e<\/span> <span class=\"re2\">robots<\/span>=off http:<span class=\"sy0\">\/\/<\/span>www.example.com<br \/>\n&nbsp;<\/div>\n<p>Forced with httrack, bypassing robots.txt (-s0):<\/p>\n<div class=\"bash dean_ch\">httrack <span 
class=\"re5\">--disable-security-limits<\/span> <span class=\"re5\">--max-rate<\/span> <span class=\"nu0\">300000000<\/span> <span class=\"re5\">-s0<\/span> http:<span class=\"sy0\">\/\/<\/span>www.example.com <span class=\"re5\">-v<\/span><br \/>\n&nbsp;<\/div>\n","protected":false},"excerpt":{"rendered":"<p>--mirror makes wget recursive; shorthand for the options -r -N -l inf --no-remove-listing --adjust-extension appends .html to downloaded HTML pages whose URL does not end in .html (for example .php pages); GET parameters stay part of the file name --convert-links rewrites links in the downloaded pages to relative paths so they work offline --page-requisites downloads the files each page needs, such as images, CSS and JS -e robots=off ignores the robots.txt file so that the mirror is really complete wget<\/p>\n","protected":false},"author":2,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[21],"tags":[35,56],"class_list":["post-736","post","type-post","status-publish","format-standard","hentry","category-ubuntu","tag-baseurl","tag-shell"],"_links":{"self":[{"href":"https:\/\/www.taywa.ch\/blog\/wp-json\/wp\/v2\/posts\/736","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.taywa.ch\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.taywa.ch\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.taywa.ch\/blog\/wp-json\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/www.taywa.ch\/blog\/wp-json\/wp\/v2\/comments?post=736"}],"version-history":[{"count":12,"href":"https:\/\/www.taywa.ch\/blog\/wp-json\/wp\/v2\/posts\/736\/revisions"}],"predecessor-version":[{"id":1257,"href":"https:\/\/www.taywa.ch\/blog\/wp-json\/wp\/v2\/posts\/736\/revisions\/1257"}],"wp:attachment":[{"href":"https:\/\/www.taywa.ch\/blog\/wp-json\/wp\/v2\/media?parent=736"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.taywa.ch
\/blog\/wp-json\/wp\/v2\/categories?post=736"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.taywa.ch\/blog\/wp-json\/wp\/v2\/tags?post=736"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}