CONCAT ERROR(S) when running CN algorithm
I have the right packages installed and the correct Python version, so why does this error come up when running the CN algorithm?
MERGE ERROR(S) AT: /burg/home/sgg2140/communitynotes/sourcecode/scoring/pflip_plus_model.py, in _compute_scoring_cutoff, at line 348: scoringCutoff = scoringCutoff.merge(cutoffByRatings[[c.noteIdKey, "ratingMin"]])
PandasTypeError: Type expectation mismatch on noteId: found=Int64 expected=int64
PandasTypeError: Input mismatch on noteId: left=int64 vs right=Int64 (UNALLOWED)
PandasTypeError: Merge key mismatch on noteId: left=int64 vs right=Int64 (UNALLOWED)
PandasTypeError: Output mismatch on noteId: result=int64 expected=None (UNALLOWED)
INFO:birdwatch.constants:Fitting pflip model elapsed time: 11.41 secs (0.19 mins)
Traceback (most recent call last):
File "/burg/home/sgg2140/communitynotes/sourcecode/main.py", line 33, in
Idk lol glhf
Sarah Grevy Gotfredsen is hitting MERGE ERROR(S) when running the CN
algorithm: a PandasTypeError reporting a type expectation mismatch on
noteId (found Int64, expected int64).
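For anyone unfamiliar with the two dtypes named in the log: `int64` is the plain NumPy integer dtype, while capitalized `Int64` is pandas' nullable integer extension dtype. The snippet below is only a minimal sketch of how the two can differ on a `noteId` column and how a key could be cast before a merge. The frames and the `align_key_dtype` helper are hypothetical, made up for illustration, and are not part of the Community Notes code.

```python
import pandas as pd

# Hypothetical frames that mimic the two sides of the failing merge.
left = pd.DataFrame({
    "noteId": pd.Series([1, 2, 3], dtype="int64"),   # NumPy int64
    "cutoff": [10, 20, 30],
})
right = pd.DataFrame({
    "noteId": pd.array([1, 2, 3], dtype="Int64"),    # pandas nullable Int64
    "ratingMin": [5, 6, 7],
})


def align_key_dtype(df: pd.DataFrame, key: str, dtype: str = "int64") -> pd.DataFrame:
    """Return a copy of df with `key` cast to `dtype` (hypothetical helper).

    Refuses to cast when the key has missing values, because a plain
    int64 column cannot represent them.
    """
    if df[key].isna().any():
        raise ValueError(f"{key} contains missing values; cannot cast to {dtype}")
    return df.assign(**{key: df[key].astype(dtype)})


print(left["noteId"].dtype, right["noteId"].dtype)

aligned = align_key_dtype(right, "noteId")
merged = left.merge(aligned, on="noteId")
print(aligned["noteId"].dtype, len(merged))
```

Casting like this only hides the symptom if the `Int64` column appeared because the input files are inconsistent, so it is probably worth checking the data first.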
On Mon, Jun 16, 2025, 5:11 AM Sarah Grevy Gotfredsen < @.***> wrote:
SarahGrevy created an issue (twitter/communitynotes#345) https://github.com/twitter/communitynotes/issues/345
INFO:birdwatch.constants:Fitting pflip model elapsed time: 11.41 secs (0.19 mins)
Traceback (most recent call last):
  File "/burg/home/sgg2140/communitynotes/sourcecode/main.py", line 33, in
    main()
  File "/burg/home/sgg2140/communitynotes/sourcecode/scoring/runner.py", line 269, in main
    return _run_scorer(args=args, dataLoader=dataLoader, extraScoringArgs=extraScoringArgs)
  File "/burg/home/sgg2140/communitynotes/sourcecode/scoring/pandas_utils.py", line 682, in _inner
    retVal = main(*args, **kwargs)
  File "/burg/home/sgg2140/communitynotes/sourcecode/scoring/runner.py", line 222, in _run_scorer
    scoredNotes, helpfulnessScores, newStatus, auxNoteInfo = run_scoring(
  File "/burg/home/sgg2140/communitynotes/sourcecode/scoring/run_scoring.py", line 1987, in run_scoring
    ) = run_prescoring(
  File "/burg/home/sgg2140/communitynotes/sourcecode/scoring/run_scoring.py", line 1268, in run_prescoring
    pflipPlusModel.fit(notes, ratings, noteStatusHistory, prescoringRaterModelOutput)
  File "/burg/home/sgg2140/communitynotes/sourcecode/scoring/pflip_plus_model.py", line 1566, in fit
    self._prepare_note_info(
  File "/burg/home/sgg2140/communitynotes/sourcecode/scoring/pflip_plus_model.py", line 1054, in _prepare_note_info
    scoringCutoff = self._compute_scoring_cutoff(
  File "/burg/home/sgg2140/communitynotes/sourcecode/scoring/pflip_plus_model.py", line 349, in _compute_scoring_cutoff
    assert len(scoringCutoff) == beforeMerge
AssertionError
srun: error: g276: task 0: Exited with exit code 1
ERROR: main.py failed
Is anyone else having this issue? We are not seeing this on our end, so one quick thing to check is whether you are running with data files that match. For example, you might expect this error if your note status history file was downloaded on a different day than your other files (e.g. the ratings files).
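As a rough way to test the mismatched-files theory, you could compare the noteIds across two of your inputs before running the scorer. The sketch below uses toy frames and a hypothetical `missing_keys` helper (neither is from the repository). It also shows how a default inner merge drops rows when one side lacks some keys, which is the kind of row-count change that would trip an assertion like `assert len(scoringCutoff) == beforeMerge` in the traceback.

```python
import pandas as pd


def missing_keys(left: pd.DataFrame, right: pd.DataFrame, key: str = "noteId") -> set:
    """Return key values present in `left` but absent from `right` (hypothetical helper)."""
    return set(left[key].dropna().tolist()) - set(right[key].dropna().tolist())


# Toy stand-ins for two inputs downloaded on different days (assumption).
scoring_cutoff = pd.DataFrame({"noteId": [1, 2, 3], "cutoff": [10, 20, 30]})
cutoff_by_ratings = pd.DataFrame({"noteId": [1, 2], "ratingMin": [5, 6]})

before_merge = len(scoring_cutoff)
# Default merge is an inner join on the shared column, so noteId 3 is dropped.
merged = scoring_cutoff.merge(cutoff_by_ratings[["noteId", "ratingMin"]])

print(len(merged) == before_merge)   # False: the row count changed
print(missing_keys(scoring_cutoff, cutoff_by_ratings))
```

If a check like this reports missing noteIds between your real files, re-downloading all of the data files from the same day would be the first thing to try.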