Skip to content

[2.2] [WIP] [Finder] Adding native finders implementations #4061

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 37 commits into from

Conversation

jfsimon
Copy link
Contributor

@jfsimon jfsimon commented Apr 22, 2012

Work in progress...

Bug fix: no
Feature addition: no
Backwards compatibility break: no
Symfony2 tests pass: yes
Fixes the following tickets: #4031

This PR intends to add native finders implementation based on shell command execution.
Planned support concerns:

  • GNU find command -> done
  • MS FINDSTR command

*
* @return bool
*/
function isValid();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about isSupported?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're right... i keep it in mind for a future cleanup.

@fabpot
Copy link
Member

fabpot commented May 15, 2012

@jfsimon What's the status of this PR?

@jfsimon
Copy link
Contributor Author

jfsimon commented May 15, 2012

@fabpot 2 features missing for the GNU find adapter: sorting result with sort command and excluding directories; 1 bug (even if tests pass, which let me thing it needs more tests): regex matching is done on full path, not basename. Then I'll need to work on MS FINDSTR command adapter (I talked to Pierre Couzy, and he's OK to help when he will have time to). I'll try to push the sort and directory excluding features this week.

@jalliot
Copy link
Contributor

jalliot commented May 15, 2012

BTW @jfsimon, in the (quite specific) case where you don't precise filenames or other options but only contains or notContains, you could call grep directly without the find. That would speed things up a bit more :)

@fabpot
Copy link
Member

fabpot commented Jun 28, 2012

@jfsimon Would be nice to be able to include this PR before 2.1.0 beta2. Would you have time to finish the work soon?

function searchInDirectory($dir);

/**
* Tests adapter support for current plateform.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

platform... is it a typo?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yep, fixed

@jfsimon
Copy link
Contributor Author

jfsimon commented Jun 29, 2012

@fabpot I'd say next week for GNU part with some help from @smaftoul.

public function end()
{
if (null === $this->parent) {
var_dump($this);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe var_export inside exception instead if you really need it

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is here for debugging, not for production... removed

@jfsimon
Copy link
Contributor Author

jfsimon commented Jul 10, 2012

It seems that I need to perform some benchmarks as find may not be so fast :/

@jfsimon
Copy link
Contributor Author

jfsimon commented Jul 10, 2012

@fabpot @stof do you think I can add benchmark scripts inside the component, or should I create a new repository for that?

@fabpot
Copy link
Member

fabpot commented Jul 10, 2012

Then benchmark scripts won't be part of the repository in the end, so you should create a new repo for that.

@jfsimon
Copy link
Contributor Author

jfsimon commented Jul 13, 2012

@fabpot @smaftoul Benchmark is ready (more cases to come): https://github.com/jfsimon/symfony-finder-benchmark
I'm glad to see that gnu_find adapter is really faster than the php one!

@stof
Copy link
Member

stof commented Jul 13, 2012

@jfsimon could you make a gist with the result of the benchmark ? I think many people will be lazy to run it themselves when looking at this ticket, and people using windows will probably be unable to run it at all :)

@jfsimon
Copy link
Contributor Author

jfsimon commented Jul 13, 2012

First results: https://gist.github.com/3107676

@jfsimon
Copy link
Contributor Author

jfsimon commented Aug 1, 2012

Sorry, I forgot [Finder] tag in 3 commits message... is it fixable?

@stof
Copy link
Member

stof commented Aug 1, 2012

@jfsimon you can edit the commit message whne doing an interactive rebase.

and btw, you will need to do a rebase anyway: the PR conflicts with master

}

/**
* Returns reegistered adapters ordered by priority without extra informations.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

typo here

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

two of them ^^

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just see the double e

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

information without s

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed

@jfsimon
Copy link
Contributor Author

jfsimon commented Aug 1, 2012

It seems that I'm an idiot, I used contains instead of name.
Real bench is here: https://gist.github.com/3230139
@fabpot sorry for the mess, I should go to bed :/
Results are still convincing!

@stof
Copy link
Member

stof commented Aug 1, 2012

They are, but the regex are not faster than glob anymore in these results

@travisbot
Copy link

This pull request fails (merged aa84e2d into b1618d2).

@jfsimon
Copy link
Contributor Author

jfsimon commented Aug 1, 2012

@travisbot you failed, not me!

@jfsimon
Copy link
Contributor Author

jfsimon commented Aug 1, 2012

Anyone to launch benchmark with php 5.4?
https://github.com/jfsimon/symfony-finder-benchmark

@lyrixx
Copy link
Member

lyrixx commented Aug 1, 2012

Bench with php 5.4.5
https://gist.github.com/3231244

@travisbot
Copy link

This pull request passes (merged 5fc01f2 into b1618d2).

@travisbot
Copy link

This pull request passes (merged ef0ebb8 into b1618d2).

@jfsimon
Copy link
Contributor Author

jfsimon commented Aug 3, 2012

@fabpot gnu_find adapter is now complete (IMHO). Should I start implementing windows one in this PR or wait for approval/merge and open another one for new adapter? It may takes a while...

@stof
Copy link
Member

stof commented Aug 3, 2012

As the Windows implementation could take a while, I would vote for a separate PR so that the GNU adapter can be merged without waiting the windows one.
But this PR will not be merged until the 2.1 stable release as it is scheduled for 2.2

@lyrixx
Copy link
Member

lyrixx commented Aug 3, 2012

👍 with @stof. There is no benefits to wait the windows adapter.

@jfsimon
Copy link
Contributor Author

jfsimon commented Oct 3, 2012

@fabpot Do you need something to be done before merging?

@fabpot
Copy link
Member

fabpot commented Oct 3, 2012

@jfsimon I hadn't had a look at this PR since a long time as it has "WIP" in the title. Can we talk about it tomorrow?

@jfsimon
Copy link
Contributor Author

jfsimon commented Oct 3, 2012

@fabpot with pleasure :)

@fabpot fabpot closed this in bb12295 Oct 4, 2012
@sun
Copy link
Contributor

sun commented Nov 7, 2012

Based on my findings, the actual cause for Finder's poor PHP adapter performance seems to be that it is using Iterators instead of RecursiveIterators. This causes the RecursiveIteratorIterator to perform a full (as in total) recursive filesystem scan without limitation, and Finder filters out unwanted items only afterwards.

By replacing the FilterIterators with RecursiveFilterIterator implementations, the total time for a limited filesystem scan can be easily reduced (cut in half, in my cases), especially when applying directory name or depth filters.

What I'm trying to say is — I'm not sure whether the abstraction into shell adapters is actually needed. It looks like the actual culprit are the iterator implementations, and fixing those could get the PHP adapter fairly close to the numbers of the shell commands — whereas it is easily possible that the additional abstraction into adapters unnecessarily cripples performance again.

Are there any plans to re-implement Finder with Recursive*Iterators?

(of course, that would be a major API break)

FWIW, here are the results of my investigation.

@fabpot
Copy link
Member

fabpot commented Nov 7, 2012

@sun: Thanks for the information. That's very interesting. IIUC, the major API break would be for the iterators, not for the public API (the one in the main Finder class), right? If that's the case, I'm +1 for the refactoring. Would you like to help on this refactoring?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.