Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use ctime for since filter? #226

Closed
igoradamenko opened this issue Apr 7, 2017 · 13 comments · Fixed by #340
Closed

Use ctime for since filter? #226

igoradamenko opened this issue Apr 7, 2017 · 13 comments · Fixed by #340

Comments

@igoradamenko
Copy link

Hi!

When gulp.src searches for files with since option, vinyl-fs compares mtime of each file with since value (here). But what if a file wasn't changed?

When I just paste file to watched directory, its mtime does not change. And gulp.src doesn't see this file. But ctime changes. May be filterSince should compare ctime too?

@phated
Copy link
Member

phated commented Apr 18, 2017

If it wasn't changed, it should be passed through. I believe that's what the code is doing.

If it's not doing that, please send a PR with a failing test.

@phated phated closed this as completed Apr 18, 2017
@yocontra
Copy link
Member

yocontra commented Apr 18, 2017

It might be a good idea to do that if that is the case - this is more of a feature request than a bug, so I'll reopen and tag it as such. Want to try sending a PR? Should be quick.

@phated
Copy link
Member

phated commented Apr 18, 2017

It doesn't make sense to filter based on ctime because that value changes when you read the file...

@phated
Copy link
Member

phated commented Apr 18, 2017

@igoradamenko
Copy link
Author

@phated, I'm not sure about that. Look at this log:

$ touch test.txt
$ stat -x test.txt
  File: "test.txt"
  Size: 0            FileType: Regular File
  Mode: (0644/-rw-r--r--)         Uid: (  501/   dante)  Gid: (   20/   staff)
Device: 1,4   Inode: 21680940    Links: 1
Access: Tue Apr 18 23:21:23 2017
Modify: Tue Apr 18 23:21:23 2017
Change: Tue Apr 18 23:21:23 2017 # init value

$ cat test.txt
$ stat -x test.txt
  File: "test.txt"
  Size: 0            FileType: Regular File
  Mode: (0644/-rw-r--r--)         Uid: (  501/   dante)  Gid: (   20/   staff)
Device: 1,4   Inode: 21680940    Links: 1
Access: Tue Apr 18 23:21:38 2017
Modify: Tue Apr 18 23:21:23 2017
Change: Tue Apr 18 23:21:23 2017 # ctime wasn't changed

$ st test.txt # st opens file in sublime text (it reads the file, obviously)
$ st test.txt # open it again just for the test
$ stat -x test.txt
  File: "test.txt"
  Size: 0            FileType: Regular File
  Mode: (0644/-rw-r--r--)         Uid: (  501/   dante)  Gid: (   20/   staff)
Device: 1,4   Inode: 21680940    Links: 1
Access: Tue Apr 18 23:22:02 2017
Modify: Tue Apr 18 23:21:23 2017
Change: Tue Apr 18 23:21:23 2017 # ctime wasn't changed

@phated
Copy link
Member

phated commented Apr 18, 2017

@igoradamenko take a look at the node docs, it says read(2) which might not be used by the tools you are using. It's been a long time since I worked on the stat stuff but I believe I saw ctime matching atime when using node's fs.read stuff.

@yocontra
Copy link
Member

yocontra commented Apr 18, 2017

@phated based on the docs i don't think ctime should change on read, if it does that sounds like node really fucked up their implementation

why there's a difference between change/modify i'd have to ask stallman himself

@igoradamenko
Copy link
Author

igoradamenko commented Apr 18, 2017

@phated, I tried to read file using node:

$ touch test.txt
$ stat -x test.txt
  File: "test.txt"
  Size: 0            FileType: Regular File
  Mode: (0644/-rw-r--r--)         Uid: (  501/   dante)  Gid: (   20/   staff)
Device: 1,4   Inode: 21681105    Links: 1
Access: Tue Apr 18 23:31:22 2017
Modify: Tue Apr 18 23:31:22 2017
Change: Tue Apr 18 23:31:22 2017 # init value
$ nano index.js
$ cat index.js
const fs = require('fs');

fs.readFileSync(`${__dirname}/test.txt`); // 1

const fd = fs.openSync(`${__dirname}/test.txt`, 'r+');
const buf = new Buffer(512);

fs.read(fd, buf, 0, 64, null, () => {}); // 2
$ node index.js
$ stat -x test.txt
  File: "test.txt"
  Size: 0            FileType: Regular File
  Mode: (0644/-rw-r--r--)         Uid: (  501/   dante)  Gid: (   20/   staff)
Device: 1,4   Inode: 21681105    Links: 1
Access: Tue Apr 18 23:36:38 2017
Modify: Tue Apr 18 23:31:22 2017
Change: Tue Apr 18 23:31:22 2017 # same value

It looks like neither read (2) nor readFile (1) didn't change ctime.

@igoradamenko
Copy link
Author

Reading file on Windows does not change ctime also (I'm not completely sure, but it looks like so):

C:\Users\IEUser\test>copy NUL test.txt
        1 file(s) copied.

C:\Users\IEUser\test>dir /T:W test.txt
 Directory of C:\Users\IEUser\test

04/18/2017  11:45 PM                 0 test.txt
               1 File(s)              0 bytes
               0 Dir(s)  109,378,023,424 bytes free

C:\Users\IEUser\test>notepad.exe test.txt

C:\Users\IEUser\test>dir /T:W test.txt
 Directory of C:\Users\IEUser\test

04/18/2017  11:45 PM                 0 test.txt
               1 File(s)              0 bytes
               0 Dir(s)  109,377,835,008 bytes free

C:\Users\IEUser\test>

File was read in 11:50 PM, but last modified time is 11:45.

@contra, I'm not completely sure that I'm right about ctime, but I will send test PR as soon as I will be able.

@phated
Copy link
Member

phated commented Jan 2, 2019

We've had a similar report on the main gulp repo. I'm thinking we should revisit the "copy a file" use case.

My current thinking: we should check mtime first because that's the most likely thing to have changed. If that's not newer, we should check if the ctime is newer than the mtime and if so, use that. Maybe we can just always use the later of the 2?

@erikkemperman @igoradamenko thoughts?

@erikkemperman
Copy link
Member

I’m not sure about windows, but on *nix the ctime is about metadata changes (mode, permissions, owner) whereas mtime is about the contents of the file.

I’m not sure if people would expect a file to suddenly pass the since filter when they only change, say, a writable-bit on it?

Then again, they probably would expect the appearance of a new file in a directory to pass the filter, I agree, even if its mtime is too old.

Impossible to tell the difference though, just from the stats... And we’re not going to want keep track of directory contents and history.

So I guess it might be best to leave it up to the user, behind an option, if s/he wants to compare to mtime, ctime, or max(mtime, ctime)?

The default, if we had considered this scenario from the beginning, should have been the max I think, but perhaps now it ought to be mtime just because it’s currently what we’ve been doing thus far.

Note that relying on mtime was never full proof — if I overwrite a file with exactly the same data, its mtime changes even though its content hasn’t.

If the distinction is important you’d already be using hashes to detect changes (which would also pick up new files in the glob) and disregard the stats.

@igoradamenko
Copy link
Author

As I said in #228, I think it's fine to change checking mtime to ctime. So, I can recreate that PR.

@phated, I'm not sure that we need to check both mtime and ctime, because, as I understand, if mtime changes, ctime changes too.

The main disagreement between @erikkemperman's position and mine is the point, that I think that it's fine to trigger changes on files when their metadata changes. @erikkemperman thinks (as I understand) that it isn't OK in case of changing rights, etc.

I'm not sure, but maybe in this case, if we can't find a silver bullet, it's better to improve since and make it possible to pass a function to since? I mean, now we have this:

if (file.stat && file.stat.mtime <= since) {
  return callback();
}

...and we can change it to this:

if (file.stat && (typeof since === 'function' ? since(file.stat) : (file.stat.mtime <= since))) {
  return callback();
}

And in this case we just need to write a caveat in README.

@erikkemperman
Copy link
Member

erikkemperman commented Jan 3, 2019

Perhaps you’re right that metadata changes ought to count for passing the filter, I was just saying I wasn’t sure people would expect it to. But it’s fine, so long as it’s documented.

I believe you’re correct that in regular usage ctime will always be equal to or greater than mtime. I’m not sure about edge cases (can I make mtime > ctime by setting both explicitly?).

So I’d prefer max(mtime, ctime) just to be safe — it’s essentially free, relative to the cost of getting stats off disk in the first place.

Also, if your suggested change is where I think it would go, note that the resolution of function options will have already been done:

/~https://github.com/gulpjs/vinyl-fs/blob/master/lib/src/prepare.js#L9

By the way, I think there was an issue with ctime on Windows for ancient versions of node... I think we still support 0.10? If so, do we want to handle that separately in the code or simply document the difference?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
4 participants