KeyError - Missing Chromosome name #4

Open
patrickCNMartin opened this issue Oct 10, 2019 · 4 comments

@patrickCNMartin

Hi,

I have been trying to use FactorNet on Drosophila data but to no avail. I have been getting the following error:

python FactorNet-master/train.py -i Training/TrainingScores_chr3R.bed -vi Validation/Validation_Peaks_chr2R.bed -k 128 -r 128 -d 256 -oc Training_output
Multi-task training
output directory (Training_output) already exists so it will be clobbered
Loading genome
Traceback (most recent call last):
  File "FactorNet-master/train.py", line 301, in <module>
    main()
  File "FactorNet-master/train.py", line 190, in main
    genome = utils.load_genome()
  File "/home/pm16057/FactorNet/FactorNet-master/utils.py", line 349, in load_genome
    onehot_chroms = parmap.map(get_onehot_chrom, chroms)
  File "/usr/lib/python2.7/site-packages/parmap/parmap.py", line 304, in map
    return _map_or_starmap(function, iterable, args, kwargs, "map")
  File "/usr/lib/python2.7/site-packages/parmap/parmap.py", line 248, in _map_or_starmap
    output = result.get()
  File "/usr/lib64/python2.7/multiprocessing/pool.py", line 554, in get
    raise self._value
KeyError: 'chr2RHet'

From my understanding of the problem, this is due to a chromosome name missing from some Python dictionary.

I was wondering if it could be that some dependency does not support chromosome names from other species or if there is a way around this issue that I have not been able to find?

I have changed files in the resources folder and made changes to utils.py, but to no avail.
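
A `KeyError` on a chromosome name like this usually means one input lists a chromosome that another lookup table does not contain. The following is a minimal Python 3 sketch of how such a mismatch could be checked before training. It is not FactorNet code: the helper names `read_chrom_names` and `missing_chroms` are hypothetical, and the two in-memory "files" are toy stand-ins for a chromosome sizes file and a FASTA index, with made-up lengths.

```python
import io


def read_chrom_names(handle):
    """Collect chromosome names from the first column of a text file.

    Works for chrom.sizes, .fai, and BED-style text. Blank lines and
    lines starting with '#', 'track', or 'browser' are skipped.
    """
    names = set()
    for line in handle:
        line = line.strip()
        if not line or line.startswith(("#", "track", "browser")):
            continue
        # Split on any whitespace so tab- and space-delimited files both work.
        names.add(line.split()[0])
    return names


def missing_chroms(requested, available):
    """Return the requested names that are absent from `available`, sorted."""
    return sorted(set(requested) - set(available))


# Toy stand-ins for real files (hypothetical contents, for illustration only).
chrom_sizes = io.StringIO("chr2R\t1000\nchr2RHet\t200\nchr3R\t1500\n")
fasta_index = io.StringIO("chr2R\t1000\t7\t50\t51\nchr3R\t1500\t1100\t50\t51\n")

requested = read_chrom_names(chrom_sizes)
available = read_chrom_names(fasta_index)
print(missing_chroms(requested, available))  # ['chr2RHet']
```

With real data, the two `io.StringIO` objects would be replaced by `open(...)` calls on the actual files. Any name reported as missing would then need to be either removed from the list of requested chromosomes or added to the genome being loaded.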

Thank you!

Kind Regards

Patrick

@daquang
Contributor

daquang commented Oct 23, 2019

Hi Patrick,

Thank you for your interest in FactorNet. Unfortunately, I have left academia and I am no longer supporting this package. However, I have been developing new packages, GenomeLoader and PillowNet, which serve as successors to FactorNet. Unlike FactorNet, which uses a CNN, PillowNet uses a U-Net, which in my opinion is a better way of modeling sequence data. I haven't really been updating the documentation for the GitHub repositories, since I've been busy getting them working on my new job's platform (I work for DNAnexus, btw. Our cloud platform makes desktops and HPCs for research almost obsolete. GPUs are so easy with this). The repositories already install pretty easily on our platform, although they should install just as easily on a desktop or a properly maintained HPC cluster. Let me know if you'd like to discuss getting them working for you. The repositories are found here:

https://github.com/daquang/GenomeLoader
https://github.com/daquang/PillowNet

-Daniel

@jhawe

jhawe commented Mar 26, 2020

Hi,

It's a real pity FactorNet is not maintained anymore.

Will there be a paper/extensive README for PillowNet?

As we also struggle quite a bit with FactorNet, it might be worthwhile to try out PillowNet.

Would you suspect that both tools should yield similar results given the same input data?

Thanks and best,
Johann

EDIT: About DNAnexus, I suspect this is not freely available, is it? So its use in academia might be quite limited?

@daquang
Contributor

daquang commented Mar 27, 2020

I actually wrote PillowNet while I was in my postdoc under the MIT license, so it is freely available. I expect the results are not too different from FactorNet, and hopefully they'll actually be better.

Originally PillowNet was going to have its own paper, but I left academia too early. It was used in my last paper, and you should cite that paper: https://www.sciencedirect.com/science/article/pii/S2212877819309573

I should definitely update PillowNet with a proper README, especially if there's a demand for it.

@jhawe

jhawe commented Apr 2, 2020

Thanks for the replies! We'll have a look at it.
Quick question: FactorNet, as far as I know, uses pure DNA sequence in addition to other signal information. Is this also done in PillowNet (or in the paper you pointed to)?

cheers
