Invalid contig error #93

Open
jalalsiddiqui opened this issue Jan 22, 2021 · 9 comments

@jalalsiddiqui

Traceback (most recent call last):
  File "/users/PAS1143/osu8165/.conda/envs/clipper3/lib/python3.7/multiprocessing/pool.py", line 121, in worker
    result = (True, func(*args, **kwds))
  File "/users/PAS1143/osu8165/.conda/envs/clipper3/lib/python3.7/site-packages/clipper/src/call_peak.py", line 932, in call_peaks
    subset_reads = list(bam_fileobj.fetch(reference=str(interval.chrom), start=interval.start, end=interval.stop))
  File "pysam/libcalignmentfile.pyx", line 1081, in pysam.libcalignmentfile.AlignmentFile.fetch
  File "pysam/libchtslib.pyx", line 686, in pysam.libchtslib.HTSFile.parse_region
ValueError: invalid contig chr1_KI270708v1_random
"""

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/users/PAS1143/osu8165/.conda/envs/clipper3/bin/clipper", line 8, in <module>
    sys.exit(call_main())
  File "/users/PAS1143/osu8165/.conda/envs/clipper3/lib/python3.7/site-packages/clipper/src/main.py", line 266, in call_main
    main(options)
  File "/users/PAS1143/osu8165/.conda/envs/clipper3/lib/python3.7/site-packages/clipper/src/main.py", line 105, in main
    peaks_dicts.append(job.get(timeout=options.timeout))
  File "/users/PAS1143/osu8165/.conda/envs/clipper3/lib/python3.7/multiprocessing/pool.py", line 657, in get
    raise self._value
ValueError: invalid contig chr1_KI270708v1_random

@byee4
Member

byee4 commented Jan 22, 2021

Random contigs were likely trimmed from the clipper references. If clipper encounters a region that doesn't exist in the reference, it might complain. You can see which chromosomes are included for each supported species. For example, you can get a unique list of valid chromosomes from hg19 GENCODE v19 (hg19_exons.bed) using some bash:

link to clipper data/regions

awk -F "\t" '{print $1}' hg19_exons.bed | sort -u
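
The same check can be sketched in Python, which also makes it easy to compare the region file against the contigs present in a BAM header. This is only a minimal illustration, not part of clipper: the function names `bed_contigs` and `missing_contigs` are hypothetical, and the list of BAM contigs is assumed to be supplied by the caller.

```python
def bed_contigs(bed_path):
    """Return the unique contig names from column 1 of a BED file, in first-seen order."""
    seen = {}
    with open(bed_path) as handle:
        for line in handle:
            # Skip blank lines and the optional BED header lines.
            if not line.strip() or line.startswith(("#", "track", "browser")):
                continue
            contig = line.split("\t", 1)[0].strip()
            seen.setdefault(contig, None)  # dict keys keep insertion order
    return list(seen)


def missing_contigs(region_contigs, bam_contigs):
    """Return the contigs listed in the regions file that are absent from the BAM."""
    available = set(bam_contigs)
    return [contig for contig in region_contigs if contig not in available]
```

If pysam is installed, the second argument could be taken from `pysam.AlignmentFile(path).references`, so that `missing_contigs(bed_contigs("hg19_exons.bed"), bam.references)` lists any contig that would fail in `fetch`.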

@jalalsiddiqui
Author

jalalsiddiqui commented Jan 22, 2021 via email

@jalalsiddiqui
Author

I ran this for around 10 hours, but nothing happened and I got no output. Do you know what the problem might be?

@byee4
Member

byee4 commented Jan 26, 2021

I have seen Clipper run for longer than that. We allow 24 hours per run, although the vast majority of jobs don't take that long.

@jalalsiddiqui
Author

jalalsiddiqui commented Jan 26, 2021 via email

@byee4
Member

byee4 commented Jan 26, 2021

I've had some success running it on an EC2 instance with more cores. A c3.8xlarge, for example, might speed it up some.

@jalalsiddiqui
Author

jalalsiddiqui commented Jan 27, 2021 via email

@jalalsiddiqui
Author

I am running the pipeline with 6 threads at the moment. Note that I am doing a single-end analysis with the first read only.

@jalalsiddiqui
Author

One more question: is there a way to monitor progress? There is no output, so I am not sure how much progress I am making on the peak calling.
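
For illustration only: the traceback above shows results being collected with `job.get(timeout=...)` from a multiprocessing pool, and a generic way to get progress output from that kind of loop is to print a line each time a job returns. The sketch below is not clipper's code and does not describe an existing clipper option. `collect_with_progress` and `square` are hypothetical names, and `ThreadPool` is used only to keep the demo small (its `apply_async` returns the same kind of result object as `multiprocessing.Pool`).

```python
import sys
import time
from multiprocessing.pool import ThreadPool


def collect_with_progress(jobs, timeout=None, stream=sys.stderr):
    """Collect results from AsyncResult-like objects, reporting after each one."""
    results = []
    total = len(jobs)
    started = time.time()
    for done, job in enumerate(jobs, start=1):
        # Blocks until this job finishes (or raises on timeout or worker error).
        results.append(job.get(timeout=timeout))
        elapsed = time.time() - started
        print(f"[{elapsed:8.1f}s] finished {done}/{total} jobs", file=stream, flush=True)
    return results


def square(x):
    # Stand-in for a per-region peak-calling task (hypothetical).
    return x * x


if __name__ == "__main__":
    with ThreadPool(processes=2) as pool:
        jobs = [pool.apply_async(square, (i,)) for i in range(5)]
        print(collect_with_progress(jobs, timeout=60))
```

Because jobs are collected in submission order, the counter only advances when the next job in line finishes, so it can lag behind the true number of completed jobs.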
