[Feature] Add Np4k helper
What this Pull Request (PR) does
This PR adds a new Fabric helper that scrapes a URL, or a list of URLs, extracts the article text, and pipes it to Fabric. For example, if you have a list of articles about a given topic, you can save the URLs to a file and run the np4k helper against that list to extract each article's text and feed it to relevant Fabric patterns, such as summarize or extract_article_wisdom.
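For reference, the core flow can be sketched roughly like this. This is a minimal sketch, not the helper's exact code: it assumes newspaper4k's `Article` API (`download()`, `parse()`, `.text`), and the function names here are illustrative.

```python
# Illustrative sketch of the np4k flow: collect URLs (a single --url or a
# --file list), extract each article's text, and write it to STDOUT so it
# can be piped into fabric.
import sys


def read_url_list(text):
    """Parse a URL-list file: one URL per line, blank lines ignored."""
    return [line.strip() for line in text.splitlines() if line.strip()]


def extract_text(url):
    """Fetch and parse one article via newspaper4k, returning its text."""
    from newspaper import Article  # third-party: the newspaper4k package

    article = Article(url)
    article.download()
    article.parse()
    return article.text


def main(urls):
    for url in urls:
        sys.stdout.write(extract_text(url) + "\n")
```

Used this way, `np4k ... | fabric --pattern summarize` works like any other STDIN-driven Fabric invocation.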
Related issues
Net new feature.
Considerations
Newspaper4k is a fork of the fantastic but no-longer-maintained Newspaper3k library. I had hoped this dependency could be installed manually if/when someone wanted to use this helper; however, I added it to the repo as a core dependency to get it working with pipx. There may be a better way to do that, but I couldn't figure it out.
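For context, declaring it as a core dependency (so pipx installs pick it up) looks roughly like the fragment below. This is a hypothetical sketch assuming a Poetry-style pyproject, which may not match the repo's actual packaging, and the version bound is illustrative, not what the PR pins:

```toml
# Hypothetical pyproject.toml fragment; version constraint is illustrative.
[tool.poetry.dependencies]
newspaper4k = "^0.9"
```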
Right now, the helper only pipes the article text to STDOUT. However, np4k produces much more data for a given article, including an NLTK-derived summary. I added an --output flag that writes all of this data to a local file, so the full value of np4k can still be leveraged without sending unnecessary data to the AI. Future patterns could be developed that explicitly use more of this data (author name, published date, etc.).
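To illustrate what a richer --output record could carry, here is a hedged sketch built on the attributes newspaper4k exposes on a parsed `Article` (`title`, `authors`, `publish_date`, `text`, and the `summary`/`keywords` populated by `article.nlp()`). The record layout and function names are illustrative, not the PR's exact format:

```python
# Sketch: flatten a parsed Article's richer fields into one JSON record,
# suitable for the --output file, while only .text goes to STDOUT.
import json


def article_record(article):
    """Collect the interesting Article attributes into a plain dict."""
    return {
        "title": article.title,
        "authors": article.authors,
        "publish_date": str(article.publish_date),
        "text": article.text,
        "summary": article.summary,    # NLTK-derived; requires article.nlp()
        "keywords": article.keywords,  # also populated by article.nlp()
    }


def write_output(article, path):
    """Write the full record to a local file as pretty-printed JSON."""
    with open(path, "w", encoding="utf-8") as fh:
        json.dump(article_record(article), fh, indent=2)
```

A pattern that wants the author or publish date could then read it from this file instead of re-scraping the page.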
I am happy to make any/all adjustments as needed. Hopefully, this will be seen as a valuable new helper.
Screenshots/Examples
$ np4k --url 'https://thehackernews.com/2024/03/github-launches-ai-powered-autofix-tool.html' | fabric --pattern summarize
# ONE SENTENCE SUMMARY:
GitHub introduces code scanning autofix in public beta, leveraging AI to suggest fixes for vulnerabilities in several programming languages.
# MAIN POINTS:
1. Code scanning autofix is now in public beta for GitHub Advanced Security customers.
2. The feature uses GitHub Copilot, CodeQL, and OpenAI GPT-4 to generate code suggestions.
3. It covers over 90% of alert types for JavaScript, Typescript, Java, and Python.
4. More than two-thirds of found vulnerabilities can be remediated with the provided suggestions.
5. The tool will expand to include C# and Go languages in the future.
6. Autofix aims to resolve vulnerabilities by suggesting potential fixes and explanations.
7. Suggestions can affect multiple files and include dependency changes.
8. Developers are advised to review suggestions carefully due to potential limitations.
9. Limitations include incorrect syntax, location, semantics changes, partial fixes, and insecure dependencies.
10. There's a risk of introducing malicious software through suggested dependencies.
# TAKEAWAYS:
1. GitHub's autofix tool significantly aids in vulnerability remediation by providing actionable code suggestions.
2. The integration of AI technologies like OpenAI GPT-4 enhances the accuracy of these suggestions.
3. Developers must critically evaluate autofix suggestions to avoid introducing new issues.
4. The tool's future expansion to more languages promises broader applicability.
5. Awareness of the tool's limitations is crucial for safe and effective use.
List example:
$ np4k --file test.txt --output kvp | fabric --pattern extract_article_wisdom
## SUMMARY
Alexander Konovalov discusses the rise of AI voice cloning scams, highlighting Americans' $10 billion loss last year and offering tips to avoid falling victim.
## IDEAS:
- Americans lost a record $10 billion to scams last year, with AI voice cloning scams becoming increasingly sophisticated.
- AI voice cloning scams have been used to impersonate public figures like Joe Biden and Taylor Swift.
- One in three adults confess they aren’t confident they’d identify a cloned voice from the real thing.
- Google searches for ‘AI voice scams’ soared by more than 200 percent in a few months.
- Laughing during a call can help identify AI, as it struggles with recognizing laughter.
- Testing reactions by saying something unexpected can reveal if you're talking to AI.
- Listening for anomalies like unusual background noises can indicate a conversation with AI.
- Verifying the caller's identity is crucial, especially when discussing sensitive subjects.
- Avoid oversharing personal information online or over the phone to prevent scammers from impersonating you.
- Treating urgency with skepticism is important, as scammers use pressure tactics to exploit victims.
- Scammers only need three seconds of audio to clone a person’s voice for scam calls.
- 77% of AI voice scam victims lose money.
- AI voice scams can lead to virtual kidnapping and unauthorized access to financial accounts.
- Caller ID spoofing and voice cloning make answering unknown phone calls extremely dangerous.
- Aura offers advanced protection against scams, including AI scam call blocking.
- Fake kidnapping phone scams and grandparent scam calls are common AI voice cloning scams.
- Fake celebrity endorsement videos and scammers cloning your voice to access accounts are growing concerns.
- Creating a family “safe word” and using strong passwords with two-factor authentication can protect against phone scams.
## QUOTES:
- "Americans lost a record $10 billion to scams last year — and scams are getting more sophisticated."
- "AI has a hard time recognizing laughter, so crack a joke and gauge the person’s reaction."
- "Listen out for unusual background noises and unexpected changes in tone, which may be a result of the variety of data used to train the AI model."
- "Scammers often use urgency to their advantage, pressuring victims into acting before they have time to spot the red flags."
- "Scammers only need three seconds of audio to 'clone' a person’s voice to use in scam calls."
- "Between caller ID spoofing and voice cloning, answering unknown phone calls has become extremely dangerous."
- "Aura’s award-winning digital security solution uses advanced technology to block scam calls, warn you of phishing attacks, and protect your identity and finances."
- "An FBI special agent in Chicago reported that families in the USA lose an average of $11,000 in every fake kidnapping scam."
- "Florida investor Clive Kabatznik had a lucky escape when fraudsters used AI voice cloning to impersonate him."
- "Aura safeguards you and your family with award-winning identity theft and credit protection."
## FACTS:
- Americans lost $10 billion to scams in one year.
- Google searches for ‘AI voice scams’ increased by over 200% in a few months.
- Scammers need only three seconds of audio to clone a voice for scam calls.
- 77% of AI voice scam victims end up losing money.
- AI voice scams can lead to virtual kidnapping and unauthorized access to financial accounts.
- Aura offers an AI-powered digital security solution that includes scam call blocking.
- Families in the USA lose an average of $11,000 in every fake kidnapping scam.
- Grandparent scams have caused several grandmothers in Canada to lose thousands of dollars.
- Fake celebrity endorsement videos have tricked consumers into buying illegitimate products.
- Scammers can use cloned voices to access bank accounts and steal savings.
## REFERENCES:
- vidby AG
- YouGiver.me
- Aura
- Federal Trade Commission
- McAfee's "The Artificial Imposter" report
- CNBC Make It
- NBC News
- The New York Times
## RECOMMENDATIONS:
- Use laughter during calls to test if you're speaking with AI.
- Say something unexpected to test the caller's reaction for authenticity.
- Listen for anomalies in the call that could indicate AI involvement.
- Verify the caller's identity by asking for specific details only the real person would know.
- Avoid oversharing personal information online or over the phone.
- Treat urgent requests with skepticism and verify any claims independently.
- Create strong, unique passwords for each account and enable two-factor authentication (2FA).
- Set up bank alerts for any login attempts or account changes.
- Discuss a plan with friends and family, including a code word for phone verification.
- Sign up for an all-in-one digital security service like Aura for comprehensive protection against scams.
Where:
$ cat test.txt
https://securityboulevard.com/2024/03/guest-essay-a-diy-guide-to-recognizing-and-derailing-generative-ai-voice-scams/
https://www.aura.com/learn/ai-voice-scams
https://www.cnbc.com/2024/01/24/how-to-protect-yourself-against-ai-voice-cloning-scams.html
Corresponding KVP output can be found on this gist: https://gist.github.com/nopslip/855cf8732fbabff761671c8dc7cb8251
And an example of JSON output for a different article for reference: https://gist.github.com/nopslip/4b5f1158f0a86429f95d07fccdd13f33
This looks great @nopslip - Can you rebase to fix the conflicts?
@ksylvan done! It should be good to go.
@danielmiessler @xssdoctor What do you think about this? Adding a scraper tool to the set of command-line tools available in fabric would be awesome.
Pinging for an update on this. @xssdoctor @danielmiessler can we get this in fabric?
Right now I use a hodgepodge of other tools like `curl | html2text` or `curl | pandoc -f html -t plain` to get web pages into fabric in a usable form.
We're moving to Go soon so we'll want to do this another way. But this is amazing. Thank you.