[rustdoc] Rustdoc should prevent long file names. #34023

Open
BenTheElder opened this issue Jun 1, 2016 · 14 comments
Labels
C-feature-request Category: A feature request, i.e: not implemented / a PR. T-rustdoc Relevant to the rustdoc team, which will review and decide on the PR/issue.

Comments

@BenTheElder

BenTheElder commented Jun 1, 2016

I'm not actually certain if handling absurdly long method names should be a goal for rustdoc, but currently they produce filenames that are too long for the filesystem. I discovered this while working with opencv-rust, which auto-generates C wrapper methods for OpenCV, some of which have very long names. One proposed solution was for rustdoc to abbreviate filenames; I suggest that a good solution might be splitting the name into 255-char chunks.

Eg the problem file: /home/benjamin/dev/opencv-rust/target/doc/opencv/sys/fn.cv_calib3d_cv_solvePnPRansac_InputArray_objectPoints_InputArray_imagePoints_InputArray_cameraMatrix_InputArray_distCoeffs_OutputArray_rvec_OutputArray_tvec_bool_useExtrinsicGuess_int_iterationsCount_float_reprojectionError_int_minInliersCount_OutputArray_inliers_int_flags.html

Would become something like: /home/benjamin/dev/opencv-rust/target/doc/opencv/sys/fn.cv_calib3d_cv_solvePnPRansac_InputArray_objectPoints_InputArray_imagePoints_InputArray_cameraMatrix_InputArray_distCoeffs_OutputArray_rvec_OutputArray_tvec_bool_useExtrinsicGuess_int_iterationsCount_float_reprojectionErr/or_int_minInliersCount_OutputArray_inliers_int_flags.html
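
The chunking idea above could be sketched roughly as follows. This is only an illustration, not how rustdoc works: `split_into_segments`, `chunked_path` and `MAX_SEGMENT_BYTES` are hypothetical names, and 255 is simply the figure used in this issue rather than a value queried from the filesystem.

```rust
use std::path::PathBuf;

/// Assumed per-segment limit in bytes (the figure quoted in this issue).
const MAX_SEGMENT_BYTES: usize = 255;

/// Splits `file_name` into pieces of at most `max_bytes` bytes each,
/// never cutting through a multi-byte UTF-8 character.
fn split_into_segments(file_name: &str, max_bytes: usize) -> Vec<&str> {
    // Any UTF-8 character fits in 4 bytes, so this guarantees progress.
    assert!(max_bytes >= 4, "a segment must be able to hold any UTF-8 char");
    let mut segments = Vec::new();
    let mut rest = file_name;
    while rest.len() > max_bytes {
        // Back up from the byte limit to the nearest character boundary.
        let mut cut = max_bytes;
        while !rest.is_char_boundary(cut) {
            cut -= 1;
        }
        let (head, tail) = rest.split_at(cut);
        segments.push(head);
        rest = tail;
    }
    segments.push(rest);
    segments
}

/// Joins the pieces under `dir`, so every piece except the last
/// becomes a directory and the last one is the file itself.
fn chunked_path(dir: &str, file_name: &str) -> PathBuf {
    let mut path = PathBuf::from(dir);
    for segment in split_into_segments(file_name, MAX_SEGMENT_BYTES) {
        path.push(segment);
    }
    path
}
```

A name that already fits is returned as a single segment, so short file names would keep their current location.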

Previous discussion on the rust subreddit: https://www.reddit.com/r/rust/comments/4m29tk/solution_to_file_name_too_long_for_cargo_doc/

@BenTheElder
Author

BenTheElder commented Jun 1, 2016

Small update: This also appears to be a regression of sorts. After switching back to the 1.8.0 toolchain with rustup, I can document the crate, as these methods are skipped.

Edit, these methods are marked #[doc(hidden)]. Perhaps then the correct answer is a way to make rustdoc not generate docs for hidden methods?

@GuillaumeGomez
Member

Edit, these methods are marked #[doc(hidden)]. Perhaps then the correct answer is a way to make rustdoc not generate docs for hidden methods?

It isn't supposed to.

@BenTheElder
Author

As far as I can tell, this method is in /target/debug/build/opencv-/out/calib3d.extern.rs and marked #[doc(hidden)]. It is then include!-ed in /target/debug/build/opencv-/out/hub.rs in a public sys module.

As far as I can tell now, it should be marked hidden and excluded from the docs, but 1.9.0 and nightly appear to attempt to build docs for these methods anyhow.
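
For readers unfamiliar with the setup, the layout described above looks roughly like this. It is a simplified sketch: the real crate pulls generated code in with include!, whereas here the item is written inline, and both function names are hypothetical stand-ins.

```rust
/// Public module standing in for the `sys` module described above.
pub mod sys {
    /// Stand-in for a generated wrapper; the name and body are hypothetical.
    /// The attribute below asks rustdoc to leave this item out of the docs,
    /// while the function itself stays public and callable.
    #[doc(hidden)]
    pub fn cv_example_wrapper_with_a_long_generated_name(x: i32) -> i32 {
        x + 1
    }

    /// A documented item, for contrast.
    pub fn visible(x: i32) -> i32 {
        x * 2
    }
}
```

With this layout, one would expect `cargo doc` to produce a page for `visible` only.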

@BenTheElder
Author

I've opened a new issue (#34025) to reflect what seems to be the actual problem, but I think that avoiding large file names might also be worth discussing so I will leave this open as well for now.

@retep998
Member

retep998 commented Jun 1, 2016

Couldn't we just shorten the name? Come up with some pattern, maybe taking inspiration from DOS-style short filenames: shorten it to 100 characters and replace anything beyond that with ~1 (or a higher number if another file has the same name), or maybe tack on a short hash of the name.
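
A minimal sketch of the numbered-suffix variant, assuming the caller tracks which names have already been handed out. `shorten_dos_style` is a hypothetical helper, and handling of the `.html` extension is left out to keep it short.

```rust
use std::collections::HashSet;

/// Returns `name` unchanged if it fits in `max_len` bytes and is still free.
/// Otherwise keeps a prefix and appends `~N`, where N is the lowest number
/// whose result is not yet in `used`. The chosen name is recorded in `used`.
fn shorten_dos_style(name: &str, max_len: usize, used: &mut HashSet<String>) -> String {
    if name.len() <= max_len && used.insert(name.to_string()) {
        return name.to_string();
    }
    for n in 1u32.. {
        let suffix = format!("~{}", n);
        // Leave room for the suffix, then back up to a character boundary.
        let mut cut = max_len.saturating_sub(suffix.len()).min(name.len());
        while !name.is_char_boundary(cut) {
            cut -= 1;
        }
        let candidate = format!("{}{}", &name[..cut], suffix);
        if used.insert(candidate.clone()) {
            return candidate;
        }
    }
    unreachable!("ran out of suffix numbers")
}
```

Note that the number a given item receives depends on the order in which names are processed, so two runs only agree if they visit items in the same order.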

@ssokolow

ssokolow commented Jun 1, 2016

I'd go with the hash since, as long as it's well-defined (e.g. take a prefix of n - hash_length characters, then hash the remaining characters using <name of hash>, then append the hash to the prefix), it's possible to re-generate it from the source materials and get the same result if necessary, and it'll be more stable across runs, in case incremental rebuild is ever desired.
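
The recipe in the parentheses can be sketched as below. The comment leaves the hash unnamed, so 64-bit FNV-1a is used here purely as an assumption because it is short and fully specified; `shorten_with_hash`, `fnv1a_64` and `HASH_LEN` are hypothetical names. Lengths are counted in bytes rather than characters.

```rust
/// 64-bit FNV-1a. A stand-in for "<name of hash>", not a recommendation.
fn fnv1a_64(bytes: &[u8]) -> u64 {
    let mut hash: u64 = 0xcbf2_9ce4_8422_2325; // FNV offset basis
    for &b in bytes {
        hash ^= u64::from(b);
        hash = hash.wrapping_mul(0x0000_0100_0000_01b3); // FNV prime
    }
    hash
}

/// Number of hex digits needed to print a u64.
const HASH_LEN: usize = 16;

/// Names that fit in `max_len` bytes are returned unchanged. Longer names
/// keep a prefix of about `max_len - HASH_LEN` bytes, and the remainder is
/// replaced by its hash, so the result depends only on the name itself.
fn shorten_with_hash(name: &str, max_len: usize) -> String {
    assert!(max_len > HASH_LEN, "no room left for a prefix");
    if name.len() <= max_len {
        return name.to_string();
    }
    // Back up to a character boundary so the prefix stays valid UTF-8.
    let mut cut = max_len - HASH_LEN;
    while !name.is_char_boundary(cut) {
        cut -= 1;
    }
    let (prefix, rest) = name.split_at(cut);
    format!("{}{:016x}", prefix, fnv1a_64(rest.as_bytes()))
}
```

Because nothing here depends on which other files exist or on processing order, re-running it on the same name always yields the same file name.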

@apasel422 apasel422 added the T-rustdoc Relevant to the rustdoc team, which will review and decide on the PR/issue. label Jun 3, 2016
@steveklabnik steveklabnik added T-dev-tools Relevant to the dev-tools subteam, which will review and decide on the PR/issue. and removed T-tools labels May 18, 2017
@Mark-Simulacrum Mark-Simulacrum added the C-feature-request Category: A feature request, i.e: not implemented / a PR. label Jul 25, 2017
@steveklabnik
Member

Triage: no change

@workingjubilee
Member

As of Windows 10 version 1607 (which was released later in 2016, actually), users can remove the path length limitation, and versions before that are only supported for Enterprise LTSC.

@ehuss ehuss removed the T-dev-tools Relevant to the dev-tools subteam, which will review and decide on the PR/issue. label Jan 18, 2022
@lolbinarycat
Contributor

most linux filesystems still have the limitation of 255 bytes per path segment, and 4096 bytes for the entire path, at least according to linux/limits.h.

additionally, it seems that search engines don't like urls longer than 2000 chars
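
Taking the two figures above at face value, a pre-flight check on an output path might look like this. It is a sketch under assumptions: `check_limits` and `LimitError` are hypothetical, the constants are hard-coded rather than obtained from the system, and byte lengths are only meaningful this way on Unix-like platforms.

```rust
use std::path::{Component, Path};

/// Assumed limits, as quoted from linux/limits.h in the comment above.
const NAME_MAX: usize = 255; // bytes per path segment
const PATH_MAX: usize = 4096; // bytes per path, including the trailing NUL

#[derive(Debug, PartialEq)]
enum LimitError {
    /// A single segment is longer than NAME_MAX; carries its byte length.
    SegmentTooLong(usize),
    /// The whole path is longer than PATH_MAX allows; carries its byte length.
    PathTooLong(usize),
}

/// Checks the whole path first, then each ordinary segment.
fn check_limits(path: &Path) -> Result<(), LimitError> {
    let total = path.as_os_str().len();
    // PATH_MAX counts the terminating null byte, so reserve one byte for it.
    if total + 1 > PATH_MAX {
        return Err(LimitError::PathTooLong(total));
    }
    for component in path.components() {
        // Root, `.` and `..` components are skipped; only names are measured.
        if let Component::Normal(segment) = component {
            if segment.len() > NAME_MAX {
                return Err(LimitError::SegmentTooLong(segment.len()));
            }
        }
    }
    Ok(())
}
```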

@ssokolow

ssokolow commented Nov 8, 2024

@lolbinarycat linux/limits.h is for libc functions that have internal limitations. It does not constrain the filesystems and most Linux filesystems have no path length limit.

See, for example, the getcwd(3) manpage:

getwd() does not malloc(3) any memory. The buf argument should be a pointer to an array at least PATH_MAX bytes long. If the length of the absolute pathname of the current working directory, including the terminating null byte, exceeds PATH_MAX bytes, NULL is returned, and errno is set to ENAMETOOLONG. (Note that on some systems, PATH_MAX may not be a compile-time constant; furthermore, its value may depend on the filesystem, see pathconf(3).) For portability and security reasons, use of getwd() is deprecated.

See also:

The TL;DR is that linux/limits.h defines a limit for paths handled by syscalls but, given that you can always break it by mounting a filesystem with PATH_MAX-length paths on a mountpoint deep inside another filesystem, it cannot be absolute. Programs and libraries which want to bypass that limitation work around it by walking up the chain of ancestors and assembling the full path in userspace.

(As far as I'm aware, all filesystem operations are now supported by newer relative syscalls like openat which let you circumvent having to work with absolute, canonicalized paths at some point inside the kernel. Just "this path, relative to that FD" or "this path, relative to that inode".)

I can confirm that I can generate test paths on ext4 which are over 5000 characters long, despite PATH_MAX being 4096 on my system... which does cause some things to then fail to get the working directory.

@lolbinarycat
Contributor

true, but you'll still encounter issues if you do the trivial approach, and NAME_MAX is still a bit tricky to get around.

@ssokolow

ssokolow commented Nov 8, 2024

Certainly. I just think it's important to make it clear that "most linux systems" don't have such a limitation... it's just certain standard library APIs that have the limitation.

(Basically, to avoid the problem I've had to work around with Serde where it can fail to serialize ext4 mtimes before the POSIX epoch that occurred due to something like metadata corruption on an old FAT12 floppy disk.)

@retep998
Member

Even if most platforms support it fine, it's still unreasonable to have such massive filenames that really don't provide any tangible benefit to the user. About the only reason I can think of to not do this is because it would break links to affected doc pages, and I personally think it's still worth doing.

@lolbinarycat
Contributor

To be fair, you could also argue that having type names that are that long is also unreasonable.

10 participants