allow_duplicate_genes not working
Hi!
I am trying to solve TSP with GA and it seems like allow_duplicate_genes is not working.
Reproduction: TSP with 32 cities, each city represented by a number in [0, ..., 31].
ga_instance = pygad.GA(num_generations=5,
                       num_parents_mating=2,
                       fitness_func=fitness,
                       init_range_low=0,
                       init_range_high=32,
                       num_genes=32,
                       gene_space=a = np.arange(0,32,1),
                       gene_type=int,
                       allow_duplicate_genes=False,
                       )
a = ga_instance.run()
solution, solution_fitness, solution_idx = ga_instance.best_solution()
print(f'{solution}')
solution.sort(axis=0)
print(solution)
It gives:
[25 15 20 1 30 1 19 13 29 10 28 3 24 12 12 5 0 26 26 6 7 2 23 16 20 18 8 11 18 3 17 26]
[ 0 1 1 2 3 3 5 6 7 8 10 11 12 12 13 15 16 17 18 18 19 20 20 23 24 25 26 26 26 28 29 30]
As you can see, the numbers 1, 3, 12, 18, 20, and 26 are duplicated.
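A quick way to confirm which values are duplicated and which are missing is to count occurrences with NumPy. This is only a sketch that reuses the solution printed above; it does not depend on PyGAD.

```python
import numpy as np

# The solution reported above, copied verbatim.
solution = np.array([25, 15, 20, 1, 30, 1, 19, 13, 29, 10, 28, 3, 24, 12, 12, 5,
                     0, 26, 26, 6, 7, 2, 23, 16, 20, 18, 8, 11, 18, 3, 17, 26])

# Count how often each value occurs; anything seen more than once is a duplicate.
values, counts = np.unique(solution, return_counts=True)
duplicated = values[counts > 1]

# Cities from the gene space 0..31 that never appear in the solution.
missing = np.setdiff1d(np.arange(32), solution)

print("duplicated:", duplicated)
print("missing:", missing)
```

For a valid tour both arrays should be empty, since 32 genes drawn from 32 distinct values must form a permutation.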
Hi,
Thanks for using PyGAD!
I have some comments on your code:
You set gene_space=a = np.arange(0,32,1) which is not valid. Where is the variable a? I wonder if that code is working.
The parameter sol_per_pop is missing. Both it and num_genes must be set whenever the initial_population parameter is not used.
I am using the latest version of PyGAD and I did not see any duplicates while allow_duplicate_genes=False. Note that I built a fitness function that returns random fitness values.
This is the code I tested where I find the difference between the following 2 sets:
- The set of unique values in the solution.
- The set of unique gene values (i.e. np.arange(0,32,1))
Since you use 32 genes and the gene space has only 32 values, the difference between those 2 sets is expected to be empty whenever there are no duplicates. This is what happens in my code. So, I think there is no issue with the allow_duplicate_genes parameter.
If my code does not reflect yours, please let me know.
import pygad
import numpy as np
def fitness(sol, idx):
    ss = set(np.unique(sol))
    r = set(np.arange(0,32,1)) - ss
    print(r)
    if len(r) > 0 :
        print("\n\nSomething is WRONG\n\n")
    return np.random.rand()
ga_instance = pygad.GA(num_generations=50,
                       num_parents_mating=2,
                       fitness_func=fitness,
                       init_range_low=0,
                       init_range_high=32,
                       sol_per_pop = 10,
                       num_genes=32,
                       gene_space=np.arange(0,32,1),
                       gene_type=int,
                       allow_duplicate_genes=False)
ga_instance.run()
solution, solution_fitness, solution_idx = ga_instance.best_solution()
# print(f'{solution}')
solution.sort(axis=0)
# print(solution)
ss = set(np.unique(solution))
r = set(np.arange(0,32,1)) - ss
print(r)
if len(r) > 0 :
    print("\n\nSomething is WRONG\n\n")
Yes, it is working. The cause was the lack of a gene_space parameter. Thank you for the response and this amazing library.
ga_instance = pygad.GA(num_generations = num_generations,
                       num_parents_mating = num_parents_mating,
                       sol_per_pop = population_size,
                       fitness_func = fitness_function,
                       num_genes = list_size,
                       gene_type = int,
                       gene_space = np.arange(0,list_size,1),
                       allow_duplicate_genes = False,
                       mutation_type = None,
                       on_start=on_start,
                       on_fitness=on_fitness,
                       on_parents=on_parents,
                       on_crossover=on_crossover,
                       on_mutation=on_mutation,
                       on_generation=on_generation,
                       on_stop=on_stop,
                       save_solutions = True)
ga_instance.run()
print('From this')
print(ga_instance.initial_population)
print('To this...')
print(ga_instance.population)
And I am getting solutions with duplicated genes, like:
[10 3 14 4 17 10 5 0 7 6 11 8 15 16 13 1 14 4 2 9]]
Mutation is not enabled, but I guess there is something I'm missing... Should allow_duplicate_genes also block duplicates after mating?
Thanks
@KevinGalassi, allow_duplicate_genes takes effect only after mutation is applied. The reason is that a duplicate can be resolved by mutation, because mutation can generate a new value for the duplicated gene.
Crossover, on the other hand, only combines genes from 2 solutions. It is not meant to introduce new gene values on its own.
But I think it would be a good feature to support. A warning could be issued if mutation is disabled while allow_duplicate_genes=False.
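To illustrate why crossover alone can break uniqueness, here is a small NumPy-only sketch. The function names `single_point_crossover` and `order_preserving_crossover` are hypothetical helpers written for this example; neither is PyGAD's internal implementation. The second function is a simplified variant in the spirit of order crossover (OX), a common permutation-preserving operator for TSP-style problems.

```python
import numpy as np

def single_point_crossover(p1, p2, point):
    """Take genes before `point` from p1 and the rest from p2."""
    return np.concatenate([p1[:point], p2[point:]])

def order_preserving_crossover(p1, p2, start, end):
    """Copy p1[start:end] in place, then fill the remaining positions
    left to right with p2's genes, skipping genes already copied."""
    child = np.full(len(p1), -1, dtype=p1.dtype)
    child[start:end] = p1[start:end]
    taken = set(p1[start:end].tolist())
    fill = [g for g in p2.tolist() if g not in taken]
    free_positions = [i for i in range(len(p1)) if not (start <= i < end)]
    child[free_positions] = fill
    return child

# Two valid permutations of 0..5 (hypothetical parents).
p1 = np.arange(6)
p2 = p1[::-1].copy()

naive_child = single_point_crossover(p1, p2, 3)
ox_child = order_preserving_crossover(p1, p2, 2, 4)

print("single-point child:", naive_child)
print("order-preserving child:", ox_child)
```

Even though both parents are valid permutations, the single-point child repeats some cities and drops others, while the order-preserving child remains a permutation. Recent PyGAD versions accept a user-defined function for crossover_type; check the documentation of your installed version for the expected signature before wiring something like this in.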
My bad, when I looked at the wiki I didn't find this information stated explicitly. I avoided mutation because of the possibility of multiple genes ending up with the same value, but the same problem can arise with crossover too.
BTW, I'm trying to solve a kind of 'Travelling Salesman Problem'; I guess I'll look online.
Thanks
I might be doing something wrong, but allow_duplicate_genes=False is not working for me; even the best solutions for the fitness function I am using have duplicate genes.
In my case the fitness function takes around 20 minutes to run, but a dummy fitness function also returns solutions with duplicated genes, such as the ones printed at the end:
def Genes_Trial(x, x_idx):
    rng_noise = np.random.default_rng(678910)
    dummy_fit = rng_noise.random()*100
    x = np.sort(x)
    return dummy_fit
gene_space = np.arange(1,41,1)
ga_instance = pygad.GA(num_generations = 300,
                       num_parents_mating = 40,
                       sol_per_pop = 50,
                       num_genes = 6,
                       init_range_low = gene_space[0],
                       init_range_high = gene_space[-1],
                       gene_space = gene_space,
                       gene_type = int,
                       keep_elitism = 2,
                       mutation_probability = 0.025,
                       fitness_func = Genes_Trial,
                       save_solutions = False,
                       allow_duplicate_genes = False,
                       save_best_solutions = True,
                       random_seed=12345
                       )
ga_instance.run()
trial = ga_instance.solutions
trial = np.sort(trial)
unique_genes = []
for i_genes in range(trial.shape[0]):
    unique_genes.append(np.unique(trial[i_genes,:]))
for i_sol in range(len(unique_genes)):
    if len(unique_genes[i_sol])<n_sensors:
        print(np.array(ga_instance.solutions[i_sol]))
Initially I tried adaptive mutation and thought that was the problem. But when mutation_type is left at its default and mutation_probability is set, there are still duplicates. However, when mutation_probability is also left at its default, no duplicates are generated.
So I am not sure how to proceed, since I am not sure mutation is happening at all when mutation_type and mutation_probability are left at their defaults.
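The per-row check above can also be done without explicit loops. The following is a minimal sketch using NumPy only; `rows_with_duplicates` is a hypothetical helper name and the small `population` array is made-up data for illustration, not output from the run above.

```python
import numpy as np

def rows_with_duplicates(population):
    """Return the indices of solutions (rows) that contain a repeated gene."""
    pop = np.asarray(population)
    # After sorting each row, a repeated value shows up as two equal neighbours.
    sorted_rows = np.sort(pop, axis=1)
    has_duplicate = (np.diff(sorted_rows, axis=1) == 0).any(axis=1)
    return np.flatnonzero(has_duplicate)

# Hypothetical population: 3 solutions with 6 genes each, values in 1..40.
population = np.array([
    [10, 3, 14, 4, 17, 9],
    [5, 22, 5, 31, 8, 40],   # 5 appears twice
    [1, 2, 3, 4, 5, 6],
])

bad_rows = rows_with_duplicates(population)
print(bad_rows)
```

The same function could be applied to any 2-D array of solutions, for example a population or a list of saved solutions converted with np.array.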
@gabrieldelpozo,
A new release will be pushed soon with a fix for this issue. It happens because crossover creates duplicate genes that are sometimes not resolved.
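Until such a fix is available, one possible workaround is to repair duplicates by hand. The sketch below is an assumption about how a repair step could look, not the fix that PyGAD ships: `repair_duplicates` is a hypothetical helper that keeps the first occurrence of each gene and replaces later repeats with unused values from the gene space.

```python
import numpy as np

def repair_duplicates(solution, gene_space, rng):
    """Return a copy of `solution` in which every repeated gene is replaced
    by a randomly chosen value from `gene_space` that is not yet used."""
    repaired = np.array(solution, copy=True)
    seen = set()
    duplicate_positions = []
    for i, gene in enumerate(repaired.tolist()):
        if gene in seen:
            duplicate_positions.append(i)
        else:
            seen.add(gene)
    if not duplicate_positions:
        return repaired
    unused = [g for g in np.asarray(gene_space).tolist() if g not in seen]
    if len(unused) < len(duplicate_positions):
        raise ValueError("gene space is too small to make all genes unique")
    # Draw distinct replacements so the repair cannot create new duplicates.
    repaired[duplicate_positions] = rng.choice(
        unused, size=len(duplicate_positions), replace=False)
    return repaired

rng = np.random.default_rng(12345)
gene_space = np.arange(1, 41, 1)

# Hypothetical 6-gene solution with two repeated values (3 and 7).
broken = np.array([3, 7, 3, 12, 7, 40])
fixed = repair_duplicates(broken, gene_space, rng)
print(fixed)
```

First occurrences are left untouched, so the repair changes as few genes as possible. Where to call such a function (for example from a callback) depends on the PyGAD version in use and is left open here.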