SMF icon indicating copy to clipboard operation
SMF copied to clipboard

Check in the welcome screen if duplicate members exists

Open albertlast opened this issue 5 years ago • 24 comments

Since we believe that the member_name and email_address is unique we should check this. got the idea by reading the issue: https://github.com/SimpleMachines/SMF2.1/issues/6207

The result is shown in the welcome screen: grafik

albertlast avatar Aug 01 '20 06:08 albertlast

I don't think we want to tell admins that may or may not have experience running database queries to run certain queries on their system. There should be another, more appropiate solution to this problem.

frandominguezl avatar Aug 01 '20 09:08 frandominguezl

Than provide a better solution.

albertlast avatar Aug 01 '20 09:08 albertlast

I analyzed the behaviour of the issue.

When you click register two times really fast, you complete the registration proccess and end up logged in with the account of highest ID. Let's say, if both accounts created have IDs 5 and 6, right after the registration you will be logged in with ID 6.

If you logout and then login with the account's credentials, you will be logged in with ID 5.

We can expect to be the account with higher activity the one with the lowest ID, so we, by default should tell the user what's the problem and how we meant to solve it (by deleting the last account of each duplicate). If the user agrees, he can proceed with the proccess. If not, we can suggest as well to ask for support.

The main option shouldn't be telling the user to fix the problem by themselves.

frandominguezl avatar Aug 01 '20 09:08 frandominguezl

So we should change the error text and remove the sql statement part?

albertlast avatar Aug 01 '20 09:08 albertlast

Yes. We should inform the user what we're going to do, and then he'll be able to decide if he wants to continue with the upgrade process or not.

frandominguezl avatar Aug 01 '20 11:08 frandominguezl

Along side a backend fix I would also like to add a frontend fix. Perhaps a checkSubmitOnce() call and disable the button via JS on first click.

MissAllSunday avatar Aug 01 '20 14:08 MissAllSunday

@MissAllSunday this sounds like a great first check tbh. That would fix 99.9% of all future duplicate registrations. Only bots or someone intentionally trying to do that would succeed after this change.

Butterflysaway avatar Aug 01 '20 19:08 Butterflysaway

@frandominguez03 Just to add - this bug can be affected by x amount of duplicate entries. I found up to 9x duplicates in my database. with a total of 744 members containing duplicates of a total of 1618 entries.

Maybe something like this:

This simply removes any entries where id_member > first id_member and member_name is identical.

There may be times when this does not catch all duplicate email_addresses if someone changed their member_name after signing up with the duplicate accounts if you allow them in your board to do so.

You guys might want to implement a better way to delete the accounts so you can remove any posts, personal messages, alerts, etc.

I ran this on my board to clear up the duplicates. You can use this and make a pull request if you want or modify it in anyway. Just a suggestion on what i personally did.

$firstEntry = array();
$request = $smcFunc['db_query']('', "
	SELECT duplicates.id_member, duplicates.member_name
	FROM {db_prefix}members AS duplicates
	WHERE duplicates.member_name IN (
		SELECT member_name
		FROM {db_prefix}members
		GROUP BY member_name
		HAVING COUNT(member_name) > 1
	) ORDER BY member_name ASC, id_member ASC;
");
while ($row = $smcFunc['db_fetch_assoc']($request))
{
	if(empty($firstEntry[$row['member_name']]))
	{
		$firstEntry[$row['member_name']] = $row['id_member'];
	}
}
foreach ($firstEntry as $k => $v)
{
	$smcFunc['db_query']('', "
		DELETE FROM {db_prefix}members
		WHERE member_name = {string:member_name} AND id_member > {int:id_member}",
		array(
			'member_name' => $k,
			'id_member' => $v,
		)
	);
}

Butterflysaway avatar Aug 01 '20 23:08 Butterflysaway

Since you got a modern mariadb version you can us this query to delete:

delete from smf_member where id_member in (
	select id_member from (
		select member_name, 
		id_member, 
		row_number() over(partition by member_name order by id_member) num 
		from smf_members
	) a
	where  a.num > 1
)

replace member_name with email_address and done.

since we support rdbms from the stoneage a smf solution would be this:

delete from smf_member where id_member in (
	select id_member from (
		select a.id_member 
		from smf_members as a
		join (
			select member_name, 
			min(id_member) id_member, 
			count(*)
			from smf_members
			group by member_name
			having count(*) > 1
		) as b on (a.member_name = b.member_name and a.id_member > b.id_member)
	)
)

albertlast avatar Aug 02 '20 04:08 albertlast

When should the cleaning in the upgrade process happen, should be this happen than autmaticaly or should some king of button for cleaning process?

albertlast avatar Aug 02 '20 05:08 albertlast

@albertlast I believe it should all take place in RC3. This way even the people who use current github repo will be able to upgrade using the official smf upgrade files once that version is pushed. This isn't a huge problem, just an annoying one. I would personally take a much more substantial approach to filtering out which accounts to remove which includes removing posts, personal messages, alerts, etc. I don't plan on removing the duplicated users on my live version anytime soon at least until this is officially added to smf. As the bug spans back to the creation of SMF (1.0), then virtually every board that runs smf has this problem and it needs to be addressed properly. The easiest way I've found to stop this from happening again is enabling recaptcha v2 on the forum as it prevents the same form from being submitted 2 times which should be sufficient for now until the permanent fix is done.

Butterflysaway avatar Aug 02 '20 06:08 Butterflysaway

This was not the quest, i asked for which step in the upgrade process should this happen and how.

albertlast avatar Aug 02 '20 07:08 albertlast

I would prefer to add a scheduled task to handle this scenarios.

I personally wouldn't run any query directly on the upgrade process. Check and detect, yes and If something is detected, inform the user about it and continue with the process.

Once the process is complete, a reminder on the admin panel to run that scheduled task to take care of those duplicated entries.

MissAllSunday avatar Aug 04 '20 20:08 MissAllSunday

so this wouldn't fix the issue, would be only a workaround.

albertlast avatar Aug 04 '20 20:08 albertlast

The check is already there, so are the txt strings that can be modified to inform the user. The only thing left is building the scheduled task.

MissAllSunday avatar Aug 04 '20 20:08 MissAllSunday

@MissAllSunday If your going to put the unique index on there, you can't have a scheduled task fix this. This will cause the operation to fail during the upgrade.

jdarwood007 avatar Aug 06 '20 00:08 jdarwood007

Does the upgrade process gets interrupted if duplicates are found? I still don't think the admin should be doing this during the upgrade, it is already convulsive enough to add yet another layer of complexity on top of it.

MissAllSunday avatar Nov 06 '20 22:11 MissAllSunday

Than came up with a better idea and not decline the first one so long.

in short the upgrade process get not interpruted, since the check is placed in the welcome step.

albertlast avatar Nov 07 '20 11:11 albertlast

The idea is fine. I want to know if duplicate ids are a problem doing the upgrade or if they prevent it, if not, this process can be moved to an admin section where the admin can decide when to do this rather than doing it on the upgrade.

MissAllSunday avatar Nov 07 '20 14:11 MissAllSunday

This pr could changed the way to delay, but the related #6209 expect this happen before the upgrade process is started.

albertlast avatar Nov 07 '20 14:11 albertlast

Cool, this is what I wanted to know. Now, how about do the check, if there are duplicates store the result somewhere in a modSettings value then add an admin interface to let the admin do the process whenever its more convenient to them.

We already have some admin tools for recounting stuff, convert to utf8, prune logs etc etc. WE can add an interface for admins to add the unique index and also offer some more alternatives on how they want to manage duplicates, something like:

We found 3 duplicates which one you want to keep or merge them into a single account.

I think we can offer more tools and options if we move this to an admin page rather than doing it on the upgrade process.

MissAllSunday avatar Nov 07 '20 14:11 MissAllSunday

Than please provide a admin page where i could lay down this feature. Or tell me which of the existing one should be used.

albertlast avatar Nov 07 '20 14:11 albertlast

There is none. A new interface will have to be added, if you could add it great if not we can do it but it will have low priority.

MissAllSunday avatar Nov 07 '20 14:11 MissAllSunday

Since it's your idea, i let you doing the thing, please highlight me back when your interface is done.

albertlast avatar Nov 07 '20 14:11 albertlast

This pull request targets release-2.1, so it doesn't belong in the 3.0 milestone, and we aren't going to make this change now in the 2.1 upgrader.

That said, if a new PR implementing these changes and the ones in #6209 is submitted for release-3.0, that would be worthwhile.

Sesquipedalian avatar May 14 '24 21:05 Sesquipedalian