-
Notifications
You must be signed in to change notification settings - Fork 62
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
add AutoPiecewiseLinFit class #95
base: master
Are you sure you want to change the base?
Conversation
for i, e in enumerate(xroot): | ||
self.bounds[i][0] = e.real-lb | ||
self.bounds[i][1] = e.real+ub |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is the purpose of lb
and ub
to give some 'wiggle room' or 'slack' where break points are specified?
I think you need to make lb ==ub
and then default this to be some percentage of the total distance of x
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, it is 'wiggle room' or 'slack' where breakpoints are specified. I agree with you to set lb==ub = some percentage of the total distance of x
Hi Yucheng, Thank you for this ambitious software contribution! I think there are many users that will find this really helpful. This is a functionality that has been requested dozens of times. The principle of restricting breakpoints to their own much smaller bounds will definitely make the optimization cheaper. This reduces the design space of possibilities by a lot. I think this PR is going to take a bit of work on both of us to get this in. Are you in a position where you can put more time into this? If so I think this will be an excellent contribution that will really help a lot of people. Questions/comments about your method
Overall design of pwlf automatic methodsIt's going to seem that I'm very picky here. Please feel free to disagree. Automatic pwlf methods should feel like a natural extension to pwlf. They should be simple and a user should be able to get the result of an automatic fit with one function call. I think we should use In terms of what goes into
On working with integers as breakpointsI've seen a number of people use these models with integers only. It turns out that integer optimization is not necessarily faster than floating point optimization. The ways I see to solve integer optimization problem:
Thanks again for this! I'm sure people will find it useful in it's current form. However, with a few modifications I think it can become even more powerful! Thanks, |
if use_bound == True: | ||
if lin_num is None: | ||
my_pwlf.fit(self.n, bounds=self.bounds, workers=-1, updating='deferred') | ||
else: | ||
my_pwlf.fit(lin_num, bounds=self.bounds, workers=-1, updating='deferred') | ||
else: | ||
if lin_num is None: | ||
my_pwlf.fit(self.n, workers=-1, updating='deferred') | ||
else: | ||
my_pwlf.fit(lin_num, workers=-1, updating='deferred') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Instead of my_pwlf.fit()
try fit_guess(x_root, bounds=self.bounds)
. I suspect this will be good enough, and much faster than the even the parallel differential evolution!
I don't have a lot of intuition on how the polynomial roots are being used, and I'm still a bit confused here. Can you do some debugging steps, and show me how the polynomials are being used. I want to see the logic step by step.
coeff = np.polyfit(self.x_data, self.y_data, degree)
f = np.poly1d(coeff)
x_hat = np.linspace(self.x_data.min(), self.x_data.max(), 1000)
f_hat = f(x_hat)
# please plot(x_hat, f_hat)
xroot = list(filter(lambda x : x.imag == 0, xroot))
xroot = list(filter(lambda x : x >= self.x_data.min() and x <= self.x_data.max(), xroot)) I think this will help me give you hints onto how to select polynomial degree, and other parameters.
So consider the polynomial
I would try something like +- 5% of the total length. You can then make an option that the user can specify which percent to use. Try it out, if it doesn't work you can tweak the default a bit.
I put a comment into the code, but I think you can get away with using
Let's hold off on this. How pwlf works by default is a continuous optimization problem, so breakpoints can be any real number between
I'm cool with this! Let's also hold off on this, and do some tweaks to the method first. I'm pretty sure once we can quickly refactor to get this to elegantly implement into pwlf. |
Hi Charles, |
Sorry, there was some problem with my last commit. |
Hi Charles, |
I just wanted to reach out and say I have not forgot about this. Getting this in is something that I think will improve this library! I have just been completely swamped to help out here. I should have more free time in the next couple months (at least I hope so!). |
just want to tag along this PR if it's not too late. Wondering what are your thoughts? |
This is good feedback. Thank you @ffffxue I think if i had a go-to way of auto fitting these models I would do it based on the variance in your data. It would shoot for a particular signal to noise ratio, but it's just in my head and not implemented yet. |
Hi Charles
First Thanks for your contributions to this project!
I write a class to demonstrate my thoughts to approach how to determine the number of segments. Basically, I use polynomial to fit the graph first and then find the roots, then use those roots with your custom deviations to define bounds for speeding up convergence. However, I think convergence can be further speeded up by restricting breakpoints to int number. But I didn't finish this part so far.
Looking for your suggestion about my thoughts. you can email me ([email protected])
King regards,
Yucheng