train using different size of image #4
Hi, have you modified the argument "img-size" in train.py? It should be set to your input image size. Also, if you use a size different from 512, you should modify the configuration of the C3ResAtnMHSA modules in the .yaml file (shown below), if present. That parameter corresponds to the size of the relative positional encoding in the Multi-Head Self-Attention module. You can adjust it according to its ratio with 512. E.g., 16 is 512 divided by 32, so you may try setting it to 417 // 32 = 13. However, as 417 is not fully divisible by 32, the above C3ResAtnMHSA setting may still go wrong. In that case you may need to set breakpoints in the code to check the size of the output features. You can also try replacing the params in the figure below (16, 32, 64) with (14, 28, 56), which is my rough deduction and not guaranteed to be accurate.
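(For reference, a rough way to derive those values, assuming the C3ResAtnMHSA params are simply the feature-map sizes at strides 8/16/32; this is a sketch of the deduction, not code from the repo:)

```python
# Sketch only: assumes the C3ResAtnMHSA params in the .yaml equal the
# feature-map sizes at strides 8/16/32 (i.e. 64/32/16 for a 512 input).
def mhsa_sizes(img_size, strides=(8, 16, 32)):
    # Integer division mirrors the backbone downsampling; for sizes not
    # divisible by 32 the real feature shapes may round differently,
    # so verify with a breakpoint as suggested above.
    return [img_size // s for s in strides]

print(mhsa_sizes(512))  # [64, 32, 16]
print(mhsa_sizes(417))  # [52, 26, 13]
```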
Hi,
please advise, thank you.
I have revisited the code and found that the input image size must strictly be even (otherwise the concatenated dimensions in the Focus module of YOLOv5 will be inconsistent). So you should set "img-size" to 416 or another even number; after that, do a global search of the .py files for the number 512 and change all occurrences to 416. Then, if you use 416, you should also modify the params in the .yaml from (16, 32, 64) to (13, 26, 52). It should work!
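(For reference, the even-size requirement comes from the pixel-interleaved slicing in YOLOv5's Focus layer; a quick sketch that reproduces the mismatch with an odd input:)

```python
import torch

# Focus-style slicing: four pixel-interleaved sub-images are concatenated.
x = torch.zeros(1, 3, 417, 417)  # odd size
patches = [x[..., ::2, ::2], x[..., 1::2, ::2], x[..., ::2, 1::2], x[..., 1::2, 1::2]]
print([tuple(p.shape[-2:]) for p in patches])
# [(209, 209), (208, 209), (209, 208), (208, 208)] -> torch.cat would fail here;
# with 416 every patch is (208, 208) and the concatenation succeeds.
```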
Hi, finally I tried to use the 512 image size and got this error:
This is the command that I used. Please advise, thank you.
It seems it is caused by the missing matching dgimgs. You can change that line of code, that is, from:
to:
OK, thank you so much for your answers. However, I am facing a new issue as follows:
Please advise, thank you.
I'm not sure what caused the problem. Did you modify the values returned by the function "load_image(self, index)"? Or try deleting the ".cache" files under the train/val/test directories to refresh YOLOv5's caching mechanism.
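(A minimal sketch for clearing those caches, assuming the usual YOLOv5 layout where the *.cache files sit inside the dataset directories; adjust the path to your own dataset:)

```python
from pathlib import Path

dataset_root = Path("path/to/dataset")  # hypothetical dataset root
for cache in dataset_root.rglob("*.cache"):
    print("removing", cache)
    cache.unlink()
```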
I confirmed that load_image is unchanged and deleted the cache. Here is the new error:
Is there any way to find the source of the error? Thank you.
It seems it is still caused by missing degraded images. I suggest checking again that each image has its corresponding degraded image in the train/val/test datasets.
I would like to clarify: so a degrade folder must exist in each of train/val/test? I only created the degrade folder inside train.
Yes, each should contain a degrade folder. For the whole structure you can also refer to #2. The degraded images in val or test are for the possible use of loss calculation. In practical use we cannot refer to the target ground truth, so the degraded images cannot be properly generated, which leads to the missing-degraded-images error above. For practical use, since we didn't write a specific inference script, you may need to put in some effort to modify the code, e.g., comment out some lines. Or, a simpler way is just to generate a series of fake degraded images, e.g., with constant or random values.
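(A minimal sketch of that last option; the folder names "images"/"degrade" and the .jpg extension are assumptions, so match them to the structure described in #2:)

```python
from pathlib import Path
import cv2
import numpy as np

for split in ("val", "test"):
    img_dir = Path(f"path/to/dataset/{split}/images")   # hypothetical paths
    deg_dir = Path(f"path/to/dataset/{split}/degrade")
    deg_dir.mkdir(parents=True, exist_ok=True)
    for img_path in img_dir.glob("*.jpg"):
        img = cv2.imread(str(img_path))
        fake = np.full_like(img, 128)                    # constant gray; random values also work
        cv2.imwrite(str(deg_dir / img_path.name), fake)
```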
Hi, I just finished the training process. Now, how do I load the model and run inference on each image by looping over the image paths, e.g., reading the image with PIL.Image or cv2? Looking at test.py and detect.py, there seem to be many steps conducted after passing the image into the network. Please advise.
You can refer to the
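(In case it helps, a rough sketch of the usual YOLOv5-style inference steps: letterbox, forward pass, NMS, rescale. This is not the repo's official inference path; the util names come from standard YOLOv5 and may differ in this fork, and this model may return extra outputs besides the detections:)

```python
import torch
import cv2
from models.experimental import attempt_load
from utils.datasets import letterbox
from utils.general import non_max_suppression, scale_coords

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = attempt_load("runs/train/exp/weights/best.pt", map_location=device).eval()

img0 = cv2.imread("some_image.jpg")                      # BGR, HxWx3
img = letterbox(img0, new_shape=512)[0]                  # resize/pad to the training size
img = img[:, :, ::-1].transpose(2, 0, 1).copy()          # BGR->RGB, HWC->CHW
img = torch.from_numpy(img).float().div(255).unsqueeze(0).to(device)

with torch.no_grad():
    pred = model(img)[0]                                  # first output = detections (assumption)
pred = non_max_suppression(pred, conf_thres=0.25, iou_thres=0.45)[0]
if pred is not None and len(pred):
    pred[:, :4] = scale_coords(img.shape[2:], pred[:, :4], img0.shape).round()
    print(pred)                                           # x1, y1, x2, y2, conf, cls
```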
Hi,
I deactivated line 217 in common.py, but it results in an error as follows:
Please advise, thank you.
For the first error, I actually failed to find the mentioned code line. Then, for the second error, it seems to be caused by a different PyTorch version; see ultralytics/yolov5#5499 (comment). To fix it, you may just need to make some modifications to upsampling.py as here.
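(For reference, the workaround discussed in that YOLOv5 thread is usually either editing torch/nn/modules/upsampling.py to drop the recompute_scale_factor keyword, or monkey-patching nn.Upsample; a hedged sketch of the latter, assuming your traceback matches that issue:)

```python
import torch.nn as nn
import torch.nn.functional as F

def _upsample_forward(self, x):
    # Same as nn.Upsample.forward but without recompute_scale_factor, an
    # attribute that checkpoints saved with older PyTorch versions lack.
    return F.interpolate(x, self.size, self.scale_factor, self.mode, self.align_corners)

nn.Upsample.forward = _upsample_forward  # apply before running test.py / detect.py
```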
Thanks for your help. It works!
Hi @WindVChen, I apologize for bothering you again. I was able to use the 512 image size successfully. However, I must use an image size of 160, and it gives the following error:
This is the yaml file of the model:
Please advise.
@ramdhan1989 Hi, the raised error is because there is an additional operation that calculates FLOPs and Params in YOLOv5, e.g., here: Lines 97 to 99 in 3325fa6
Thus, you should change the image resolution used for that calculation. See the answer here: #4 (comment).
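(Roughly what that profiling code looks like in YOLOv5, shown here as a sketch with a stand-in model; the point is that a fixed resolution is hard-coded in the dummy input given to thop, so it has to be changed to match your img-size:)

```python
import torch
import torch.nn as nn
from copy import deepcopy
from thop import profile

model = nn.Sequential(nn.Conv2d(3, 16, 3, 2, 1), nn.SiLU())  # stand-in for the real network

img_size = 160                                  # was 512 in the original calculation
dummy = torch.zeros(1, 3, img_size, img_size)   # dummy input used only for profiling
flops = profile(deepcopy(model), inputs=(dummy,), verbose=False)[0] / 1e9 * 2  # GFLOPs
print(f"{flops:.2f} GFLOPs")
```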
Thanks, it is running now. However, the training performance seems strange after replacing all the default 512 values with 160: using the default 512 I got 18.1 GFLOPs, while with size 160 I only got 1.81 GFLOPs. I am not sure if this is related or not. Thanks.
The variation in FLOPs is normal, as FLOPs are positively correlated with feature map size. Therefore, the larger the input image, the higher the FLOPs.
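As a rough check: FLOPs scale with the number of pixels, so going from 512 to 160 should cut them by about (160/512)^2 ≈ 0.098, and 18.1 GFLOPs × 0.098 ≈ 1.8 GFLOPs, which is close to the 1.81 you observed.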
OK, noted. I am curious about DegradeGenerate.py:
If I reduce the image size to 160, do I need to change minDis from 130 to another value?
This value is set to ensure that a pixel far away from object targets does not become too blurry. If you apply the method to high-resolution images, e.g., <4 m, the current value seems fine. If on low-resolution images, e.g., >16 m, the value should be set smaller. Another thing you may need to pay attention to is the design of the Degrade Function. As objects in the Levir-Ship dataset are mostly under 20 pixels, the current function is designed to avoid blurring the pixels in the object area. Therefore, you may need to modify the function to adapt it to your own dataset. (For advice on the function design, you can refer to Fig. 8 in the paper.)
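(To make the discussion concrete, a sketch of the general idea only; this is not the actual DegradeGenerate.py, and the Gaussian-blur form, the 1.03 base, and the minDis cap are taken from this thread rather than verified against the code:)

```python
import numpy as np
import cv2

def degrade(img, dist_map, minDis=130, base=1.03):
    """Blur each pixel with a kernel whose size grows with the pixel's
    distance to the nearest object target, capped at minDis."""
    d = np.minimum(dist_map, minDis)              # cap distances so far-away pixels stay bounded
    ksize = (base ** d).astype(int) // 2 * 2 + 1  # exponential growth, forced to odd kernel sizes
    out = img.copy()
    for k in np.unique(ksize):
        if k <= 1:
            continue
        blurred = cv2.GaussianBlur(img, (int(k), int(k)), 0)
        mask = ksize == k
        out[mask] = blurred[mask]                 # copy blurred pixels for this kernel size
    return out
```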
I am trying to work through your suggestions with the paper and your comment here. Let me show you some examples. The image size is 160*160 and the values below are normalized. For this case, I will set minDis to 140, assuming a pixel size of roughly 24 (160*0.15). So my box size would be 31, i.e., (1.03^140)/2, which is between 24 and 40 (1/4 of the image size).
Is my configuration above good enough? Thanks.
Sorry for my late reply. The setting of minDis seems OK. You can give it a try, and I think the difference in minDis will not affect the final result too much. What may matter more is still the design of the Degrade Function. For example, in your Example 1, I notice that the samples cover a large range, even reaching 0.6*160 = 96 pixels. Then the current Function is not suitable, as from #3 (comment) we can see that the degrade operation is applied to pixels whose distance to the object target is larger than 20. Thus, the large object targets in Example 1 will be partially blurred, which may lead to an accuracy drop. Therefore, I suggest modifying the Function to keep more of the object targets clear. For Example 3, whose maximum target size only reaches 0.06*160 = 9.6, the current Function will keep much of the background unblurred, so you also need to modify the function there.
For Example 1, I changed the kernel-size function to y = 1.0066^x, assuming the largest object is 96 pixels (~1.88 based on the equation). Thanks.
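(A quick check of that deduction, as a sketch: choose the base b so that b^x hits the target kernel value exactly at the largest object size:)

```python
def base_for(max_object_px, target_kernel):
    # Solve b ** max_object_px == target_kernel for b.
    return target_kernel ** (1.0 / max_object_px)

print(round(base_for(96, 1.88), 4))  # ~1.0066, matching y = 1.0066**x above
print(round(1.0066 ** 96, 2))        # ~1.88
```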
Hi, I tried to use 417*417 images but it returns the following error message:
Are there any lines that I need to modify or change to train with images other than 512*512?
Thanks