'Clean ChrysanthemumGarden' script in EpubEditor doesn't work
Describe the bug Gibberish in EPUB is not fixed after using the Clean ChrysanthemumGarden script in EpubEditor.
To Reproduce Steps to reproduce the behavior:
- Go to https://chrysanthemumgarden.com/novel-tl/cyse/
- Click on Pack EPUB in WebToEpub extension
- After EPUB is downloaded, go to https://dteviot.github.io/EpubEditor/ and upload the EPUB, then click on Clean ChrysanthemumGarden
- Open the EPUB that was modified
Expected behavior Usually the gibberish text would be fixed into normal words and sentences.
Screenshots
Example in Chapter 1 of the gibberish text in the modified file.
vs the original text on the website
Desktop (please complete the following information):
- OS: Windows
- Browser: Chrome
- Version 140.0.7339.80
Additional context I initially thought this was a problem with that specific novel only, but I tested a few other novels on CG and got the same result: the gibberish text wasn't fixed by EpubEditor.
I am having the same issue, it was working fine as of 9/8
@bookimp @lunevale
They've changed how the text is being scrambled. They're using multiple substitution cyphers (I'm not sure how many, but from simple inspection, I can see at least 6.) The text is unscrambled by the browser using custom fonts. And I think they are assigning the cyphers at random, as the same page downloaded twice is using different cyphers. And also inserting garbage chunks. (The garbage is at least easy to spot. It all has a hidden style.)
So, in answer to your question:
Is it possible to write a new script to fix the site?
It's a lot harder than last time, I don't know how many cyphers/fonts they are using. Figuring out each cypher is fairly simple (if tedious), but I'm not sure if they're generating them on the fly. In which case, it becomes much, much more work.
At this point, I'm not interested in investing the time trying to crack it.
Thanks for taking the time to take a look at the issue and providing an update!
Please leave the cleaner in the EpubEditor as I have found 1 single novel (I Have Medicine) it still works for.
@bookimp
Please leave the cleaner in the EpubEditor as
I was not planning on removing it
What type of help do you need @dteviot ? If it's something like matching letters and recording them down, I could probably help with that if it helps you not have to deal with the tedious stuff. Also is there a way to download the fonts they're using?
If the text gets unscrambled using custom fonts... is there a higher probability that the cyphers aren't being generated at random and they just have a set / finite number of cyphers? I'm assuming they're matching a different font for each cypher? Or is that not how it works? (Complete coding muggle here.)
I also came hoping that this issue would be fixable, is it really not possible to decipher...? This site has a lot of novels, so really hoping a fix will work in the future :'(
I've been trying to load a bunch of substitution cyphers by reloading the pages but it feels like they just change cyphers every couple of minutes? So far I have four different versions of one novel downloaded, each having its own cypher/ letter pattern. For some reason, one cypher pops up more often for me than the others. Out of the 8 epubs I saved (all just one novel), 4 share one cypher, and then there's another pattern that came up twice.
NRJ (The) - 1
cWS (The) - 4
gNt (The) - 2
bYt (The) - 1
So far, the other two cyphers have not popped up again, so I feel more like they really are just cycling through a handful of cyphers and they aren't generating random cyphers on the fly? If we are able to map out the letter combos for you @dteviot , would it be possible to update the cleanup script? I also wouldn't mind if the solution is multiple buttons for each cypher, if that makes the process easier (going with the assumption that there's only around 5 or so cyphers). Maybe there could be instructions like, "If 'The' is spelt as 'cWs', press this button or use this cypher"?
edit: typos
edit 2: I have now saved 9 epubs of the same novel. The bYt pattern finally popped up again. Here's the new tally. I'll keep trying to save epubs and see if any load up using new cyphers (since you mentioned you clocked 6 cyphers).
NRJ (The) - 1
cWS (The) - 4
gNt (The) - 2
bYt (The) - 2
@blucat678
since you mentioned you clocked 6 cyphers
I'm pretty sure I saw 6 different font-family values, so assumed 6 different cyphers. (see later on). I think I only checked 5 of them to see if the cyphers themselves were different.
Anyway.
You can use the following steps to compute the decryption key for a cypher.
1. Open a chapter on the site.
2. Identify a string that is encrypted in the source, but decrypted on the screen. (Open DevTools, go to the "Elements" tab and search for span[style^='font-family']. This should show a string that is encrypted.)
3. Right click on the highlighted <span> element in DevTools and select "Edit as HTML".
4. ADD the following text at the start of the encrypted text in the <span> element: abc def ghi jkl mno pqr stu vwx yz ABC DEF GHI JKL MNO PQR STU VWX YZ
5. Click on another element in the Elements tab.
6. Now go back to the normal view. You should see what looks like random text inserted into the normal page, just before the decrypted text.
7. Using this, you now know that an encoded 'a' is shown on screen as a 'c', a 'b' as 'H', etc. So, you really just need to type the string that you see. This is the decryption key.
8. Repeat steps 2 through 7, selecting different encrypted strings to get the additional decryption keys. (Note: you might want to record the style value of the <span> for each key. For a given page, the style indicates the cypher being used, i.e. all spans on the page with the same style use the same cypher key. That makes sense, as the style indicates the font being used.)
9. To decrypt, you need to take each encrypted string and figure out which cypher is being used. The following is what I used for a different site.
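// Decrypt every element matching `selector` using one substitution key.
// `clear` lists, in order, what each cypher letter (a-z, then A-Z) really stands for.
// `dom` is not defined here; it is assumed to be the document EpubEditor supplies to the script.
function decrypt(clear, selector) {
    let crypt = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ";
    // Map each cypher letter to its clear-text letter.
    let decryptTable = new Map();
    for (let i = 0; i < crypt.length; ++i) {
        decryptTable.set(crypt[i], clear[i]);
    }
    // Characters without a table entry (spaces, punctuation, digits) pass through unchanged.
    let decryptChar = (c) => decryptTable.get(c) ?? c;
    let decryptString = (cypherText) => cypherText.split("").map(c => decryptChar(c)).join("");
    for (let e of dom.querySelectorAll(selector)) {
        console.log(e.textContent);   // scrambled text
        e.textContent = decryptString(e.textContent);
        console.log(e.textContent);   // descrambled text
    }
}
// One call per cypher; the selector picks the spans whose inline style names that cypher's font.
decrypt("wcrmbatihvlxdngospykqeuzfjWCRMBATIHVLXDNGOSPYKQEUZFJ", "span[style*='OpenSans-1']");
decrypt("dznqohwfytcmpaerubklgvsxjiDZNQOHWFYTCMPAERUBKLGVSXJI", "span[style*='OpenSans-2']");
decrypt("iqydbeanljopmukrhztxsvfcwgIQYDBEANLJOPMUKRHZTXSVFCWG", "span[style*='OpenSans-3']");
return true;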
Note that it's using the style to indicate the cypher that's being used. So, something to check: is ChrysanthemumGarden always using the same font-family code in the style for a given cypher?
If so, then modifying the above script is trivial. If not (and I only glanced at it briefly, so I'm not sure) we'd need to figure out how to map style to cypher. (It's possible that each cypher only has a fixed set of styles. e.g. there are 4 styles that indicate it's the first cypher, another 4 for the 2nd etc.) But they may be doing something else. You'll need to look at the collected data and figure it out.
Hopefully that's enough for you to get started. Let me know if any of the above is unclear.
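One way to gather the data for that check is to list the distinct font-family values found in a saved chapter. The following is only a minimal sketch, assuming the scrambled spans carry an inline style of the form font-family: NAME; as described above. The function name collectFontFamilies and the sample names FontA and FontB are made up for illustration.

```javascript
// Count how often each font-family name appears in inline <span> styles.
// Assumption: scrambled spans look like <span style="font-family: NAME;">.
function collectFontFamilies(html) {
    const pattern = /<span[^>]*style="[^"]*font-family:\s*([^;"]+)/g;
    const counts = new Map();
    for (const match of html.matchAll(pattern)) {
        const name = match[1].trim();
        counts.set(name, (counts.get(name) ?? 0) + 1);
    }
    return counts;
}

// Hypothetical chapter fragment; FontA and FontB stand in for real names.
const sampleHtml =
    '<p><span style="font-family: FontA;">xyz</span> plain text ' +
    '<span style="font-family: FontB;">abc</span> ' +
    '<span style="font-family: FontA;">xyz</span></p>';

for (const [name, count] of collectFontFamilies(sampleHtml)) {
    console.log(name, count);
}
```

Running this over the same chapter downloaded several times, and comparing which name goes with which key, would show whether a font-family value is tied to one cypher.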
@dteviot i didn't have a look but could something like this work?
Too large to not hide it
- Use the test version to download your epub. Test versions for Firefox and Chrome have been uploaded to https://github.com/dteviot/WebToEpub/releases/tag/developer-build. Pick the one suitable for you, follow the "How to install from Source" instructions at https://github.com/dteviot/WebToEpub/tree/ExperimentalTabMode#user-content-installation and let me know how it goes.
- Extract your epub like you would a zip file (7-zip etc.) into a folder.
- Open the folder with Visual Studio Code.
- Use the "search in all files" function with this regex search:
requiem_tnr_[^()]
- Save the found word in a new txt file, add the last characters in the empty (), and append
|()
- Repeat step 5 until you can't find new matches.
- Create links from the words. Prepend https://requiemtls.com/wp-content/themes/lightnovel/fonts/ and append .ttf
- Open these links in your browser. Each should open the download menu; save the files with their original names.
- Use calibre to edit your epub.
- File -> Import files into book
- Select all your downloaded fonts.
- Open the file stylesheet.css and paste (Ctrl+V) at the end of the file.
- Change "Change this to the relative path to: OEBPS" to ".."
- Change the font family from "Times New Roman" to the font file name without ".ttf"
- Save your epub.
- Finished.
@dteviot I don't know if it is possible to add this functionality to EpubEditor, as CORS is preventing the download of the fonts (I think).
Originally posted by @gamebeaker in #1925
@gamebeaker
In short, you're suggesting adding the decryption fonts to the epub. In theory, that should work. In practice, I've never tried it. I'm not sure the steps will work as written: the font names I've seen looked like
<span style="font-family: rnlfJtfRCW;">
<span style="font-family: LPJMfkmHKG;">
@dteviot i will have a try at it later.
@dteviot lol I didn't realize each paragraph actually uses a different cypher. I thought it was just one cypher throughout the whole novel, since I was just looking at one paragraph! (Also, I did find a 5th cypher).
So far, I have these two cyphers:
cWS <span style="font-family: ZxXoTeIptL;">
abc def ghi jkl mno pqr stu vwx yz ABC DEF GHI JKL MNO PQR STU VWX YZ
qVT PNE AHb ykp xiY tlW dOz UGn sM cZX BQu SaR KIC Jwg FLD efr vhm jo
gNt <span style="font-family: ijqXQijeiD;">
abc def ghi jkl mno pqr stu vwx yz ABC DEF GHI JKL MNO PQR STU VWX YZ
Pwy UBV TYq AXx ZMf Ejr SeD azC kW oiv HJb Klt NdL Ohu pgI mQs cnF RG
I noticed that cWS got used twice in the same chapter and both times, it used the ZxXoTeIptL font family. Will continue to find more cyphers and update/edit here as I go.
edit 1 -- so far, all fonts and cyphers still come in pairs.
bYt <span style="font-family: WTKNOkuWha;">
abc def ghi jkl mno pqr stu vwx yz ABC DEF GHI JKL MNO PQR STU VWX YZ
dTK bCM wpk GWJ rJO UiF Ves PoX Rf QSm uvq glE yDB Lnz IYH AZc axt hN
iyG <span style="font-family: rnlfJtfRCW;"> (This was the 5th one I found while reloading pages lol)
abc def ghi jkl mno pqr stu vwx yz ABC DEF GHI JKL MNO PQR STU VWX YZ
Jzn CuU ZtT gKG Akv wBS OYL Hsi ha NEP pMV efW Roq lym bjc IXr dQD Fx
edit 2 -- These are the five font-cypher sets so far. I'll try checking more chapters in case there's another one.
NRJ <span style="font-family: LPJMfkmHKG;">
abc def ghi jkl mno pqr stu vwx yz ABC DEF GHI JKL MNO PQR STU VWX YZ
cHM ZtW Yfa Eip jXb RPL ogA FSB DV rOm UNx Ilk eCs zTu wKh dJn GqQ yv
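Because the rows are typed by hand, it may be worth checking each one before turning it into a key. The sketch below is an illustration only (checkKey is a made-up helper, not part of EpubEditor): it strips the spaces from a row and reports any letter that appears twice or not at all. A valid key should contain each of the 52 letters exactly once.

```javascript
const ALPHABET = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ";

// Turn a space-grouped row into a key string and sanity-check it.
function checkKey(row) {
    const key = row.replace(/\s+/g, "");
    const seen = new Set();
    const duplicates = new Set();
    for (const c of key) {
        if (seen.has(c)) {
            duplicates.add(c);
        }
        seen.add(c);
    }
    const missing = [...ALPHABET].filter(c => !seen.has(c));
    const ok = key.length === ALPHABET.length
        && duplicates.size === 0
        && missing.length === 0;
    return { key, ok, duplicates: [...duplicates], missing };
}

// Rows copied from the tables above.
const zx = checkKey("qVT PNE AHb ykp xiY tlW dOz UGn sM cZX BQu SaR KIC Jwg FLD efr vhm jo");
console.log(zx.ok, zx.key);

const wt = checkKey("dTK bCM wpk GWJ rJO UiF Ves PoX Rf QSm uvq glE yDB Lnz IYH AZc axt hN");
console.log(wt.ok, wt.duplicates, wt.missing); // duplicates: ["J"], missing: ["j"]
```

The WTKNOkuWha row as posted contains "J" twice and no "j", so one of the two is presumably a lowercase "j". Which one cannot be determined from the row alone; it needs a look at the rendered text.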
@dteviot I tried replacing the values in the script you provided with the cypher and font combos and it works!!! Woohoo! Thanks so much!
@bonnetchuu @lunevale @bookimp @gamebeaker just copy-paste the script below into the space provided in Epub Editor then drag and drop your CG epub into the Drop Zone.
Lastly, click the [Run script above to modify Epub] button, then save!
function decrypt(clear, selector) {
    let crypt = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ";
    let decryptTable = new Map();
    for (let i = 0; i < crypt.length; ++i) {
        decryptTable.set(crypt[i], clear[i]);
    }
    let decryptChar = (c) => decryptTable.get(c) ?? c;
    let decryptString = (cypherText) => cypherText.split("").map(c => decryptChar(c)).join("");
    for (let e of dom.querySelectorAll(selector)) {
        console.log(e.textContent);
        e.textContent = decryptString(e.textContent);
        console.log(e.textContent);
    }
}
decrypt("qVTPNEAHbykpxiYtlWdOzUGnsMcZXBQuSaRKICJwgFLDefrvhmjo", "span[style*='ZxXoTeIptL']");
decrypt("PwyUBVTYqAXxZMfEjrSeDazCkWoivHJbKltNdLOhupgImQscnFRG", "span[style*='ijqXQijeiD']");
decrypt("dTKbCMwpkGWJrJOUiFVesPoXRfQSmuvqglEyDBLnzIYHAZcaxthN", "span[style*='WTKNOkuWha']");
decrypt("JznCuUZtTgKGAkvwBSOYLHsihaNEPpMVefWRoqlymbjcIXrdQDFx", "span[style*='rnlfJtfRCW']");
decrypt("cHMZtWYfaEipjXbRPLogAFSBDVrOmUNxIlkeCszTuwKhdJnGqQyv", "span[style*='LPJMfkmHKG']");
return true;
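To sanity-check a key outside EpubEditor, the same table lookup can be run on a plain string. This is a sketch rather than the EpubEditor code itself (makeDecryptor is a hypothetical helper name). It uses the ZxXoTeIptL and ijqXQijeiD keys from the script above, plus the earlier observation that "The" shows up as "cWS" and "gNt" under those two cyphers.

```javascript
const CRYPT = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ";

// Build a function that decrypts plain strings with one substitution key.
function makeDecryptor(clear) {
    const table = new Map();
    for (let i = 0; i < CRYPT.length; ++i) {
        table.set(CRYPT[i], clear[i]);
    }
    // Anything that is not a letter (spaces, punctuation) is left as it is.
    return (text) => [...text].map(c => table.get(c) ?? c).join("");
}

const decryptZx = makeDecryptor("qVTPNEAHbykpxiYtlWdOzUGnsMcZXBQuSaRKICJwgFLDefrvhmjo");
const decryptIj = makeDecryptor("PwyUBVTYqAXxZMfEjrSeDazCkWoivHJbKltNdLOhupgImQscnFRG");

console.log(decryptZx("cWS")); // The
console.log(decryptIj("gNt")); // The
```

If a known word does not come out right, the key (or the font-family it is paired with) is probably wrong.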
@blucat678 omg omg, thank you so so much!!!! 😭😭😭 you literally helped so many readers and came up with a solution really fast omg!! tho it was probably really difficult and time-consuming on your end, thank you so much for not giving up and kindly letting us use your hard-earned solution as well!! 🤯🥹 your hard work and many efforts won't be forgotten!! ❤️💕
Edit: LOL I was too stoked about the solution, but yes ofc thank you so much to @dteviot as always!! your solutions gave the foundation to the finish line, ty!! <3
The script works perfectly! Incredible work, thank you so much @blucat678 @dteviot! ❤
@bonnetchuu thanks for the kind words! You should thank @dteviot tho. The script was from them and I was mostly just following their instruction and doing the somewhat tedious grunt work (I'm a complete noob).
@dteviot & @blucat678 thank you so much!
Notes.
- @blucat678 Thank you for doing the tedious grunt work. Congratulations, you're now responsible for doing this again if/when ChrysanthemumGarden change the cyphers.
- If the script is working with 5 cyphers, then there's obviously only 5 cyphers. And my claim of 6 was in error. So, my apologies for the extra work. How did I make the mistake? Answer, I was sloppy. I think what happened was I checked one page and saw 3 different cyphers. Looked at a 2nd page and saw another 3 cyphers. Started checking to see if any cyphers were in common, but stopped after comparing the first two.
- Remaining tasks:
- Add the script to EpubEditor's scripts collection. https://github.com/dteviot/EpubEditor/tree/master/mutators
- Add the new logic to EpubEditor's ChrysanthemumGarden button (so people won't continue to complain)
- Update readme for EpubEditor
- Add instructions on how to solve the substitution cypher.
- Link to EpubEditor's scripts collection.
And I'm seeing more cyphers.
decrypt("iKhDSORsAbqBtGNYpecfHQEwklxJlWCmTLjFdzrPXuvVonMygUZa", "span[style*='PWJEddcfVv']");
decrypt("gjkChAdlBJYOVIxTXnisWLvmyEMtuGzPpaebFDcZoRHSwUrNfqKQ", "span[style*='ofcUGYMWCy']");
Plus:
<span style="font-family: hffmcMyCbf;">
<span style="font-family: ktlmWRazmy;">
<span style="font-family: UxneBYgsjE;">
At this point, I'm stopping. Someone else can search for them.
Updated EpubEditor with list of known cyphers.
Notes: 59 minutes work.
I see there are now seven cyphers in total, but this may just be the beginning as they may be experimenting with adding more or just making attempts to slowly encrypt the entire chapter text. It doesn't seem all that possible to use variables here or softcode it so that new cyphers are automatically picked up in a universally generalised script, though that would be really handy since they've upped the security so much.
These are the other cyphers I found:
XMgbgIppHk PWJEddcfVv lqagMDCZsf
But because I can't distinguish between the lowercase "l" (letter L) and the uppercase "I" (pronoun) in their font, these two letters got mixed up when I decrypted the text. I don't quite know how to copy the encryption key directly from their site because it just keeps copying the source code. I also think it's highly unlikely that the cyphers will stop at seven; they will probably keep increasing...
@yuyu-cloud
I can't distinguish between the lowercase "l" (letter L) and the uppercase "I" (pronoun) in their font,
I have the same problem. However, if you look at the decrypted text in a different font, it's obvious where an uppercase "I" is being used instead of a lowercase "l", and you can fix the key.
I don't quite know how to copy the encryption key directly from their site
You can't. When you're copying the text, you're copying the underlying scrambled text. It's when the characters are rendered to the screen by the font that they are "decrypted".
all that possible to use variables here or softcode it so that new cyphers are automatically picked up in a universally generalised script
Theoretically, as they're using simple substitution cyphers, you could use letter frequency analysis and word pattern matching to automatically figure out the key. But you generally need longer text snippets to do that.
@dteviot
Given the time-frame in which the cyphers have now switched up and changed completely, making a list of the previously known cyphers just seems to be exhaustive and time-consuming given the rate in which they've changed, so is there perhaps a possibility that they encoded some randomiser that automatically re-encrypts the scrambled text with new cyphers every 24 hours? If this was the case, since the cyphers are inconstant and variable, unless the script is automated in its configuration to account for all combinations of characters (including the font combos), then this seems impossible to crack via hardcoding as the script would be rendered ineffective within hours of it being decrypted.
Theoretically, as they're using simple substitution cyphers, you could use letter frequency analysis and word pattern matching to automatically figure out the key. But you generally need longer text snippets to do that.
I'm not quite sure what you mean by the longer text snippets? Is this referring to examples from the text on the source itself? And if this solution is perhaps feasible given the simplicity of the substitution cyphers, would it require much more effort to implement these possible methods of letter frequency analysis and pattern matching as you've stated into ePubEditor? Really sorry for the bother, but this source is just really huge and this sudden change-up after years of non-action was surprising ahah, so really hoping that there will come a permanent solution one day (soon)... 😭😭😭
P.S. thank you for the tip on differentiating those two letters as well! 😊 tho that also seems just as exhaustive to do repeatedly 🥲
@yuyu-cloud
u mean by the longer text snippets?
By "text snippets", I'm referring to the "scrambled" text. However, it only appears as short snippets: 4 or 5 sentences, possibly 2 paragraphs, per chapter. Because an automated analysis is statistical, it needs a big enough sample (i.e. length of "scrambled" text) to get decent statistics from.
It might be possible to ingest an entire story (multiple chapters) to collect enough bulk to do an analysis. But that's kind of beyond the scope of EpubEditor. I would probably need to write a new tool for that.
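As a rough illustration of what such a tool would start from, here is a sketch that only counts letters in scrambled text. The function name letterFrequencies is an assumption for this example, and nothing like it exists in EpubEditor as far as this thread shows. Counting is case-sensitive because these cyphers map lowercase and uppercase letters independently.

```javascript
// Count how often each letter occurs, most frequent first.
function letterFrequencies(text) {
    const counts = new Map();
    for (const c of text) {
        if (/^[A-Za-z]$/.test(c)) {
            counts.set(c, (counts.get(c) ?? 0) + 1);
        }
    }
    return [...counts.entries()].sort((a, b) => b[1] - a[1]);
}

// In practice `scrambled` would be all the text using one cypher,
// gathered from many chapters. This short string is only a placeholder.
const scrambled = "cWS cWS c";
console.log(letterFrequencies(scrambled));
```

In ordinary English the most frequent letters are typically e, t, a and o, so the top entries are candidates for those letters. Turning the counts into a full key would still need word pattern matching and a much larger sample, which this sketch does not attempt.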
P.S. thank you for the tip on differentiating those two letters as well! 😊 tho that also seems just as exhaustive to do repeatedly
I had a thought that might reduce the work a bit. The idea is:
- When you write the key, instead of guessing between 'I' and 'l', put in '2' and '3' in the first key, then '4' and '5' in the next key, etc. (If you've got more keys, you could also use !, @, #, etc.)
- When you run EpubEditor, open the console window before clicking on the "Clean" button. Epub Editor will write the scrambled and then the descrambled text to the console.
- So, by looking at the output text, it should be easy to see where 2, 3, etc. are inserted into words, and from context it should be obvious which letter each one is. e.g. if you see "2eve2", 2 must be lowercase 'l' ("level"). And by elimination 3 must be uppercase 'I'.
- You can repeat for the other numbers, characters.
- You can now fix all the keys in one hit.
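Scanning the console output by eye works, but the placeholder words can also be pulled out automatically. A small sketch of that idea follows; wordsWithPlaceholders is a made-up name, and the sample sentence is invented to match the "2eve2" example above.

```javascript
// Group the words of the descrambled text by the placeholder they contain.
function wordsWithPlaceholders(text, placeholders = "2345") {
    const found = new Map();
    for (const word of text.split(/\s+/)) {
        for (const p of placeholders) {
            if (word.includes(p)) {
                if (!found.has(p)) {
                    found.set(p, []);
                }
                found.get(p).push(word);
            }
        }
    }
    return found;
}

// Invented descrambled output where '2' and '3' stand in for 'l' and 'I'.
const output = "3 think the 2eve2 is sti22 too high";
for (const [placeholder, words] of wordsWithPlaceholders(output)) {
    console.log(placeholder, words);
}
```

Note that digits which genuinely belong to the story text will show up too, so the lists still need a human look.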
Additional font-family mentioned in thread
<span style="font-family: hffmcMyCbf;">
abc def ghi jkl mno pqr stu vwx yz ABC DEF GHI JKL MNO PQR STU VWX YZ
FGq NYQ LTP UHe cEr xRu CjB kDX bM aKy fzO hJd ipo IAg Wlt ZVs nmS vw
<span style="font-family: ktlmWRazmy;">
abc def ghi jkl mno pqr stu vwx yz ABC DEF GHI JKL MNO PQR STU VWX YZ
upT Zvv jGa MwR BUX elq JAC QfF ky brE nmo Wcg HxY Pzt Ssh DOI dLi KN
<span style="font-family: UxneBYgsjE;">
abc def ghi jkl mno pqr stu vwx yz ABC DEF GHI JKL MNO PQR STU VWX YZ
RWO Vtg zYj NfX MPQ qsc dZK wrL IF BCe vhH SAE DIp noG Tuk iby xam JU
<span style="font-family: XMgbgIppHk;">
abc def ghi jkl mno pqr stu vwx yz ABC DEF GHI JKL MNO PQR STU VWX YZ
uZc Qtk AyR nJg xGV TbE XYw OBI Wh vmK qoP rjd ceH NDp Uzf SFM ais IL
<span style="font-family: lqagMDCZsf;">
abc def ghi jkl mno pqr stu vwx yz ABC DEF GHI JKL MNO PQR STU VWX YZ
inD FJI bUa cwv HOI dxu shA oLV MZ CSe YjP Xkz NtQ Rfy qTr pWG gmE BK
Added script format
decrypt("FGqNYQLTPUHecErxRuCjBkDXbMaKyfzOhJdipolAgWItZVsnmSvw", "span[style*='hffmcMyCbf']");
decrypt("upTZvvjGaMwRBUXelqJACQfFkybrEnmoWcgHxYPztSshDOIdLiKN", "span[style*='ktlmWRazmy']");
decrypt("RWOVtgzYjNfXMPQqscdZKwrLlFBCevhHSAEDIpnoGTukibyxamJU", "span[style*='UxneBYgsjE']");
decrypt("uZcQtkAyRnJgxGVTbEXYwOBlWhvmKqoPrjdceHNDpUzfSFMaisIL", "span[style*='XMgbgIppHk']");
decrypt("inDFJlbUacwvHOIdxushAoLVMZCSeYjPXkzNtQRfyqTrpWGgmEBK", "span[style*='lqagMDCZsf']");
I was poking around and saw that the font is saved as woff2 file eg
https://chrysanthemumgarden.com/wp-content/plugins/chrys-garden-plugin/resources/fonts/used/PWJEddcfVv.woff2
So I just change the woff2 name to get the others and use below viewer
https://products.aspose.app/font/viewer/woff2
Still have to copy the output manually though.
Alternative woff2 viewers are listed below, but their output shows the CAPITAL letters first:
https://opentype.js.org/glyph-inspector.html
https://hellogreg.github.io/glytter/
Hope it helps
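Since the file name appears to be just the font-family value plus .woff2, the download links can be generated from the names collected in this thread. A small sketch, assuming every font lives in the same directory as the example URL above (fontUrl is an illustrative name):

```javascript
// Directory taken from the example URL above. Whether every font is
// served from here is an assumption.
const FONT_BASE =
    "https://chrysanthemumgarden.com/wp-content/plugins/chrys-garden-plugin/resources/fonts/used/";

// Build the woff2 URL for one font-family value.
function fontUrl(fontFamily) {
    return `${FONT_BASE}${encodeURIComponent(fontFamily)}.woff2`;
}

// Font-family values mentioned earlier in the thread.
const names = ["PWJEddcfVv", "ofcUGYMWCy", "hffmcMyCbf", "ktlmWRazmy", "UxneBYgsjE"];
for (const name of names) {
    console.log(fontUrl(name));
}
```

Each printed link can then be opened in one of the viewers to read off the key.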
@ghus3rz
Thank you very much for that. Note, I fixed up some "I" vs "l" errors in your cyphers. I've added the new cyphers to EpubEditor, and the sample script. Sample script can be found at https://github.com/dteviot/EpubEditor/blob/master/mutators/chrysanthemumgarden-UnscrambleTextV2.js (so I don't have to keep updating it here.) I've also tweaked the EpubEditor logging, so it now gives the cypher name, and doesn't bother writing the cypher text. Makes it much easier to inspect the output.
Thank you ... I checked for the small "i" but still missed the upper "I" vs lower "l" earlier.
@dteviot I got it working with the fonts. Do you think it is a ToS violation if we just use the original fonts to display the content? I left the last step, adding the files to stylesheet.css, to EpubEditor. I don't think it would be that hard to add it to W2E.
I don't think it is, because we don't decrypt anything; the text is just displayed with the original font. #2193 and https://github.com/dteviot/EpubEditor/pull/12
hello, I've been a longtime lurker and created an account specifically to contribute to this CG business. I just so happened to be working independently on the same novel displayed in dteviot's screenshot:
I'm commenting because, in following the above directions to identify cyphers, I got different cyphers from the ones captured in the screenshot, and in fact running the decrypt script with the current 'sample script' did not unscramble the text. I was able to get 1 cypher of my own working, but ran into the I/l issue.
I've been working in Calibre and HTML for a long time and I have the time to learn if someone is willing to teach me--I am more than willing to spend the hours necessary to dig up cyphers, I just want to know why mine aren't working and why the cyphers are changing.
Not sure how collaboration here works but I would really love to help! And want to share my own tools and resources too.