
Is there a way to download a MediaWiki installation dump from archive.org?

Open trenkert opened this issue 1 year ago • 1 comments

There are wikis preserved on archive.org which are no longer accessible on their original servers. Is there any way to download those wikis (mainly mediawiki installations) in full to import them into a fresh mediawiki installation and run them again locally?

trenkert avatar Jun 29 '24 18:06 trenkert

On 29/06/24 21:03, Thomas Renkert wrote:

There are wikis preserved on archive.org which are no longer accessible on their original servers.

Yes, thousands of them.

Is there any way to download those wikis (mainly mediawiki installations) in full to import them into a fresh mediawiki installation and run them again locally?

Yes, just click the relevant download button on the sidebar or click "show all" and then copy the download URL for use with your preferred download manager (like wget).

Then see https://www.mediawiki.org/wiki/Manual:Importing_XML_dumps
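Before importing, it can help to check that the downloaded file really is a MediaWiki XML export and to see which pages it contains. The following is only a minimal sketch, assuming an uncompressed export file; the function name `list_page_titles` and the file name `dump.xml` are hypothetical and not part of any WikiTeam or MediaWiki tool.

```python
import xml.etree.ElementTree as ET


def list_page_titles(dump_file):
    """Return the <title> of every <page> in a MediaWiki XML export.

    dump_file may be a path or an open binary file object.
    The export namespace differs between MediaWiki versions,
    so tags are compared by local name only.
    """
    titles = []
    # iterparse streams the file, so large dumps are not loaded at once.
    for _event, elem in ET.iterparse(dump_file, events=("end",)):
        if elem.tag.rsplit("}", 1)[-1] != "page":
            continue
        for child in elem:
            if child.tag.rsplit("}", 1)[-1] == "title":
                titles.append(child.text or "")
                break
        # Free the memory held by the page we have just processed.
        elem.clear()
    return titles
```

For example, `len(list_page_titles("dump.xml"))` gives the number of pages to expect after the import. If the download is a compressed archive, it has to be unpacked first.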

nemobis avatar Jun 30 '24 11:06 nemobis

Thank you. I did not mean archive.org as in archived wiki XML dumps, but the Wayback Machine with its captured pages of a wiki. The XML dump does not exist on archive.org, but the Wayback Machine has the pages captured. Is it possible to reconstruct an XML dump from pages captured on the Wayback Machine?

trenkert avatar Jun 30 '24 16:06 trenkert

Not really. You'll need an HTML crawler customised for MediaWiki purposes and then a script to convert the HTML back to wikitext. There are some partial solutions of this kind in https://www.mediawiki.org/wiki/Category:Import/Export. Page history can't realistically be reconstructed.

If the wiki has fewer than a thousand pages, it's probably easier to copy and paste the pages one by one with the VisualEditor.
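As a rough illustration of the first step of such a crawler, the captured pages of a wiki can be listed through the Wayback Machine's CDX API. This is a minimal sketch, not an existing WikiTeam feature: the function names and the `example.org/wiki/` prefix are hypothetical, and it only enumerates captures. Converting the fetched HTML back to wikitext is not covered.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

CDX_ENDPOINT = "https://web.archive.org/cdx/search/cdx"


def build_cdx_query(wiki_prefix):
    """Build a CDX query URL for all captures under wiki_prefix."""
    params = {
        "url": wiki_prefix,
        "matchType": "prefix",       # every URL starting with the prefix
        "output": "json",
        "fl": "urlkey,timestamp,original",
        "filter": "statuscode:200",  # skip redirects and error pages
        "collapse": "urlkey",        # one capture per distinct URL
    }
    return CDX_ENDPOINT + "?" + urlencode(params)


def parse_cdx_rows(payload):
    """Turn the CDX JSON payload (header row + records) into dicts."""
    rows = json.loads(payload) if payload.strip() else []
    if not rows:
        return []
    header, *records = rows
    return [dict(zip(header, record)) for record in records]


def raw_capture_url(timestamp, original):
    """URL of a capture as originally served (the id_ modifier
    asks the Wayback Machine not to rewrite links or add its toolbar)."""
    return f"https://web.archive.org/web/{timestamp}id_/{original}"


def list_captures(wiki_prefix):
    """Query the CDX API over the network and return the capture records."""
    with urlopen(build_cdx_query(wiki_prefix)) as response:
        return parse_cdx_rows(response.read().decode("utf-8"))
```

Calling `list_captures("example.org/wiki/")` and passing each record's `timestamp` and `original` to `raw_capture_url` yields the URLs a crawler would fetch. Only the latest rendered text of each page is recoverable this way, in line with the remark above about history.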

nemobis avatar Jun 30 '24 17:06 nemobis

thanks!

trenkert avatar Jun 30 '24 18:06 trenkert