Segmentation fault #20

linda5mith · 2023-10-09T16:38:58Z

Hi there, I'm trying to run an all vs. all comparison of around 10k sequences, which average about 100 kb in length.

I've tried running with -c 8 and without specifying the number of cores (the server has 24 cores), but I get the error "Segmentation fault (core dumped)" every time.

~/programs/Identity/bin/identity -d pooled_genomes.fasta -o output.txt -t 0.9 -c 8

I'm not sure what I'm doing wrong. Any help would be greatly appreciated! :)
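For anyone trying to reproduce this, a quick summary of the input file may help narrow things down. The sketch below is not part of Identity or MeShClust; the function name `fasta_summary` and the choice of which characters to flag are assumptions made for illustration. It counts records, reports length statistics, and tallies characters other than A, C, G, T and N. Whether any of these conditions actually triggers the crash is not known from this thread.

```python
import sys
from collections import Counter


def fasta_summary(lines):
    """Return basic statistics for FASTA records given an iterable of lines."""
    lengths = []        # one entry per record
    other = Counter()   # characters outside A, C, G, T, N (case-insensitive)
    current = None      # length of the record being read; None before the first header
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if current is not None:
                lengths.append(current)
            current = 0
        elif current is not None:
            # Sequence text before the first header is ignored.
            current += len(line)
            other.update(ch for ch in line.upper() if ch not in "ACGTN")
    if current is not None:
        lengths.append(current)
    return {
        "records": len(lengths),
        "empty_records": sum(1 for n in lengths if n == 0),
        "min_length": min(lengths, default=0),
        "max_length": max(lengths, default=0),
        "mean_length": sum(lengths) / len(lengths) if lengths else 0.0,
        "non_acgtn": dict(other),
    }


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python fasta_summary.py pooled_genomes.fasta
    with open(sys.argv[1]) as handle:
        for key, value in fasta_summary(handle).items():
            print(f"{key}: {value}")
```

Posting the resulting numbers alongside the command line would give a maintainer the record count, the length range, and any unusual characters without needing the full file.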

linda5mith · 2023-10-10T11:11:18Z

I tried running meshclust with the smallest batch size but am still getting a segmentation fault:

/home/administrator/programs/Identity/bin/meshclust -d pooled_genomes.fasta -o meshclust.clstr -t 0.9 -c 8 -e y -a n -p 10 -b 1000 -v 1000

  MSE: 0.000172281
Optimizing ...
Validating ...
        MAE: 0.00986641
        MSE: 0.000248818

Clustering ... 

Data run 1 ...
        Processed sequences: 1000
        Unprocessed sequences: 0
        Found centers: 55
Segmentation fault (core dumped)

Is there any way I can get this working on my system?

valentynbez · 2024-01-18T08:36:07Z

I have a similar problem, but at a different step:

./identity -d (gunzip -c crc_phages/crc_phages.mvirs.fa.gz | psub) -t 0.7 -o output.txt -c 8

Identity 1.2 is developed by Hani Z. Girgis, PhD.

This program calculates DNA sequence identity scores rapidly without alignment.

Copyright (C) 2020 Hani Z. Girgis, PhD

Academic use: Affero General Public License version 1.

Any restrictions to use for profit or non-academics: Alternative commercial license is required.

This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY;
without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

Please contact Dr. Hani Z. Girgis (hzgirgis@buffalo.edu) if you need more information.

Please cite the following papers: 
        Identity: Rapid alignment-free prediction of sequence alignment identity scores using
        self-supervised general linear models (2021). Hani Z. Girgis, Benjamin T. James, and
        Brian B. Luczak. NAR Genom Bioinform, 13(1), lqab001.

        A survey and evaluations of histogram-based statistics in alignment-free sequence
        comparison (2019). Brian B. Luczak, Benjamin T. James, and Hani Z. Girgis. Briefings
        in Bioinformatics, 20(4):1222–1237.

Database file: /tmp/.psub.TOS7qnPQj6
Query file: Not provided
Output file: output.txt
Cores: 8
Threshold: 0.7
Automatically relax threshold: Yes
All vs. all: Yes

Average: 14000
K: 6
Histogram size: 4096
A histogram entry is 32 bits.
Generating data.
fish: Job 1, './identity -d (gunzip -c /nfs/n…' terminated by signal SIGSEGV (Address boundary error)

linda5mith closed this as not planned Oct 10, 2023

linda5mith reopened this Oct 10, 2023
