Matthew Durrant

11 issues by Matthew Durrant

Expected behavior: I am clustering billions of protein sequences. I already built the database. I was expecting linclust to run fairly quickly, but it seems to get stuck on...

I wanted to try calculating mash distances using my own code. I exported the hashes as integers for two `.msh` files using the `mash info -d` command. When I run...
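For context, here is a minimal sketch of how a Mash-style distance could be computed from two exported hash lists. It assumes both lists are bottom-s MinHash sketches built with the same k-mer size and hash function. The function name `mash_distance` and its parameters are hypothetical, and this is not Mash's own code. The Jaccard index is estimated from the smallest `sketch_size` hashes of the merged sketches and then converted with D = -ln(2j / (1 + j)) / k.

```python
import math


def mash_distance(hashes_a, hashes_b, kmer_size, sketch_size=None):
    """Estimate a Mash-style distance from two lists of integer hashes.

    Hypothetical helper: assumes both inputs are bottom-s sketches made
    with the same k-mer size and hash function.
    """
    set_a, set_b = set(hashes_a), set(hashes_b)
    if not set_a or not set_b:
        raise ValueError("both sketches must contain at least one hash")

    # Default to the smaller sketch so both inputs cover the same range.
    if sketch_size is None:
        sketch_size = min(len(set_a), len(set_b))

    # Keep only the smallest hashes of the merged sketch (bottom-s of the union).
    merged = sorted(set_a | set_b)[:sketch_size]

    # Count merged hashes that occur in both sketches.
    shared = sum(1 for h in merged if h in set_a and h in set_b)
    jaccard = shared / len(merged)

    if jaccard == 0:
        return 1.0  # no shared hashes: report the maximum distance
    if jaccard == 1:
        return 0.0  # identical sketches

    # Convert the Jaccard estimate to a distance, capped at 1.0.
    return min(1.0, -math.log(2 * jaccard / (1 + jaccard)) / kmer_size)
```

For example, `mash_distance([1, 2, 3, 4], [1, 2, 5, 6], kmer_size=21)` finds two shared hashes among the four smallest merged hashes, giving a Jaccard estimate of 0.5. If a hand-rolled calculation disagrees with Mash's output, a simple intersection-over-union of the two full hash sets in place of the bottom-s merge is one possible cause worth checking.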

Hello, I'm not sure why you encoded the following chemistry into your color scheme: Polar = G, S, T, Y, C; Neutral = N, Q; Basic = K, R, H...

I decided to compare the output of IGGsearch with the output of MIDAS using the default database. I have noticed that the two approaches return quite different results. Here are...

Hi Stephen, I was wondering if you have the sample-specific metagenomes that you assembled available. These would be of great use to me in addition to the representative genomes. Thank...

Do you have any recommendations on how we should normalize the data prior to clustering? What approach did you take?

It's hard to overstate how important it is to understand what it means for data to be 'tidy'. Tidy data is an important concept if you want to make the...

enhancement
after-lesson-release
complex
future goal
type:discussion

Hi, I access `llama3-70b` through groq using this [python tool](https://github.com/simonw/llm). I was hoping to also use groq with gp.nvim. Any plans to support this?

enhancement

Say you hypothetically had three phenotypes - cough, headache, and congestion. These phenotypes will all highly correlate with each other among individuals. Wouldn't this correlation inflate the Bayes factor? How...

I'm aligning 326 protein structures using the commands:

```
reseek -pdb2mega pdb_dir/ -output structs.mega
muscle -super5 structs.mega -output output.afa
```

And it ran successfully up until the `Consensus sequences` step...