Skip to content

ENH: Port PVsyst and Desoto parameter estimation from MATLAB #708

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 148 commits into from
Aug 28, 2020

Conversation

mikofski
Copy link
Member

@mikofski mikofski commented May 4, 2019

pvlib python pull request guidelines

Thank you for your contribution to pvlib python! You may delete all of these instructions except for the list below.

You may submit a pull request with your code at any stage of completion.

The following items must be addressed before the code can be merged. Please don't hesitate to ask for help if you're unsure of how to accomplish any of the items below:

  • Closes issue porting pvl_PVsyst_parameter_estimation.m to python along with all associated codes to allow for full functionality #227
  • I am familiar with the contributing guidelines.
  • Fully tested. Added and/or modified tests to ensure correct behavior for all reasonable inputs. Tests (usually) must pass on the TravisCI and Appveyor testing services.
  • Updates entries to docs/sphinx/source/api.rst for API changes.
  • Adds description and name entries in the appropriate docs/sphinx/source/whatsnew file for all changes.
  • Code quality and style is sufficient. Passes LGTM and SticklerCI checks.
  • New code is fully documented. Includes sphinx/numpydoc compliant docstrings and comments in the code where necessary.
  • Pull request is nearly complete and ready for detailed review.

This is a continuation of #229 originally by @mattguttenberg in August, 2016

Sorry, something went wrong.

Matt Guttenberg and others added 7 commits May 4, 2019 01:55
from PVLIB_MATLAB by Matt Guttenberg Aug-2016, see pvlib#229 and pvlib#227

Added ported functions and tests to pvlib

Added ported versions of calc_theta_phi_exact, lambertw, singlediode,
update_io_known_n, update_rsh_fixed_pt and associated tests to pvl-python.
Also added a __init__ file ot the tests folder to describe the file folder
and allow the tests to be called if necessary. PVsyst_parameter_estimation
and singlediode are not complete versions

Changed the tests to properly call the appropriate functions

Finished the singlediode script

Added Test script for singlediode

Added tests for singlediode functions and updated scripts

Updated the lambertw and singlediode scripts to fix some small errors that
were found while testing. Added 38 tests for the singlediode script
functions althought more tests will be added.

Finished singlediode and corresponding tests

Finished adding tests for singlediode that tests all of the
functionalities of the script, including properly raising errors. Updated
the singlediode script to fix any errors found while testing. Added a
statement to the lambertw function to make the calculations slightly
faster in the event a -inf case.

Added a new file for the Schumaker_QSpline script

Finished Schumaker_QSpline script

Fixed Schumaker_QSpline script based on test

Adjusted all index references to be integer values to avoid the warnings
that were displayed. Fixed all logic errors. Fixed the sorting mechanism
for the final array.

Added test file for Schumaker_QSpline script

Added a Schumaker_QSpline test

Added a Schumaker_QSpline test (more to be added) and fixed the
singlediode_tests script to adhere to PEP8 standards

Added a couple more tests for Schumaker_QSpline

Added the est_single_diode_parameter port and fixed typos

Added the est_single_diode_parameter ported script to pvlib. Fixed a typo
in the Schumaker_QSpline script that caused an error under a very
particular circumstance. Added brackets in the Schumaker_QSpline test
script to get rid of syntax errors.

Added tests for the est_single_diode_param script

Delete __init__.py

Changed one test case for lambertw

Fixed some ambiguities in the imports

Updated the imports in many of the regular and test scripts to fix some
ambiguity issues that might have been occurring during testing

Started adding code to the PVsyst_parameter_estimation script

Added Classes to singlediode and PVsyst_Parameter_estimation

Added a class to move on with the code for the PVsyst_paramter_estimation
script and added a class to the singlediode script to wrap up the answers
in a more succinct format. Adjusted the singlediode tests to reflect this
change.

Ported over some of the code for PVsyst_parameter_estimation

Ported over some of the sections from PVsyst_parameter_estimation. Added
numdiff function to the PVsyst_parameter_estimation script. Since the
script uses matplotlib, added the requirement to the setup.

Finished PVsyst_parameter_estimation and adjusted outputs of functions

Finished porting PVsyst_parameter_estimation except for the section with
the robust_fit. Changed the outputs of this function and singlediode from
a class to a ordereddict and changed the inputs for
PVsyst_parameter_estimation from classes to ordereddicts as well.

Changed the setup to include scipy as a required toolbox

Added robust fit algorithm to PVsyst_parameter_estimation

Added the robust fit algorithm where it was needed to determine desired
information. Since this alrogithm used the statsmodels toolbox, the setup
was changed to reflect this addition.

Removed lambertw since scipy already had a working copy. Script Changes

Removed lambertw and associated test script since scipy.special already
had a working lambertw that has been tested and verified. Changed
associated scripts to reflect this change. Some bug fixes for
Pvsyst_paramter_estimation

Removed some coded scripts and updated PVsyst_parameter_estimation

Removed i_from_v, v_from_i and singlediode and used the already coded and
tested versions within pvlib. Updated PVsyst_parameter_estimation based on
some bug fixes.

More bug fixes

Fixed additional typos and other errors that have prevented the code from
completing. Now that the code runs to completion, additional checks will
be run to determine whether the converged values are appropriate.

Updated calc_theta_phi_exact to handle an edge case

modified calc_theta_phi_exact to handle the edge case where nnsvth was
equal to 0.

Brought back lambertw script for use in update_rsh_fixed_pt

Reintroduced lambertw script for use in update_rsh_fixed_pt since the
lambertw script was having problems for cases where rsh was extremely high
in calc_theta_phi_exact

A couple mroe PVsyst_parameter_estimation bug fixes

Fixed import errors and removed unused lines

Fixed an import error in update_io_known_n and removed v_from_i from the
tests since the code is using the code that already exists and has been
tested in pvsystem. Removed an unused import in update_rsh_fixed_pt

Fixed typos / errors

Fixed an error in PVsyst_parameter_estimation in the numdiff function
where the summations were not occuring properly. Fixed a typo in pvsystem
where a parameter was accidentally changed.

Final bug fixes for PVsyst_parameter_estimation

Fixed a typo in one of the plots

Style Update

Modified all written scripts to adhere to the flake8 and pep8-naming
guidelines. Updated tests as suggested by Will

Documentation Update

Updated script documentation for each of the written scripts to fit better
with the pvlib documentation

Added Documents from Cliff

Verified

This commit was created on GitHub.com and signed with GitHub’s verified signature. The key has expired.
- to estimate_parameters() since it was the same as the module
- update imports and calls everywhere
- other modules still have functions shadowing them, like
pvsyst_parametre_estimation()

Verified

This commit was created on GitHub.com and signed with GitHub’s verified signature. The key has expired.

Verified

This commit was created on GitHub.com and signed with GitHub’s verified signature. The key has expired.

Verified

This commit was created on GitHub.com and signed with GitHub’s verified signature. The key has expired.

Verified

This commit was created on GitHub.com and signed with GitHub’s verified signature. The key has expired.

Verified

This commit was created on GitHub.com and signed with GitHub’s verified signature. The key has expired.
@mikofski
Copy link
Member Author

@cwhanse curious, do you know if the Schumaker quadratic spline parameters could be replaced by scipy.interpolate.BSpline? If not would you be interested in contributing your code to Scipy, or if you're not then could I contribute it?

@cwhanse
Copy link
Member

cwhanse commented May 10, 2019

@mikofski Bspline isn't a replace for the Schumaker spline. Schumaker's method constrains each spline to be convex down (negative quadratic coefficient) - this is important to mimic the expected shape of the IV curve.

I'd say yes to pushing to scipy, but we should reconsider how I handled the ends of the data. Schumaker's paper wasn't clear on that point.

@cwhanse cwhanse mentioned this pull request Jun 11, 2019
@cwhanse cwhanse added this to the 0.7.0 milestone Jun 18, 2019
@wholmgren wholmgren modified the milestones: 0.7.0, 0.8.0 Nov 26, 2019
- move ivtools.py to ivtools/__init__.py, all tests PASS
- fix Schmaker_QSpline again, boolean indexing in Numpy is different
 than MATLAB, size must be the same, so get just the first (n-1) indices
 first
- fix est_single_diode_param.py which uses np.tile, only works with
 integers, not floats, so remove periods from 1. to make it an int
- rename calc_theta_phi_exact and lambertw to be prefixed with test so that
 pytest actually runs them, both PASS
- start fixing test_PVsyst_parameter_estimation, FAIL
- in calc_theta_phi_exactly, replace if-then with
 np.where(nnsvth is zero, use inf, or calculate)
- restrict arrays to dtype=float where they're created from arrays, this
 prevents arrays of arrays with dtype=object, which FAILS with
 np.isnan(doesn't work on dtype=object, only float)
 TODO: fix this!
- in lambertw b/c any(must be an iterable) is called on z which is
 scalar when called from PVsyst_parameter_estimation then need to make z
at least a 1d array, even tho this causes problems later, oh well
 TODO: fix this!
- in pvsyst_parameter_estimation, replace enumerate with zip(),
 and more importantly in rectify_IV_curve() remove any nan's which were
 added to the ivcurves
- the old tests are still PASSING, but test-pvsyst is a WIP
- ignore w503,w504 break before/after binary operator
- add logging to pvsyst_parameter_estimation
- add latex $math$ to graphics output
- replace repeated len(f) with n
- use concise indexing notation in numdiff
- add test data from original MATLAB, based on pvsyst demo
- more concise indices in numdiff
- add verbosity to least_squares
- limit to 100 ivcurves
- the last 500 points seem to cause a problem that I need more time to pin down
- instead of testing against the matlab generated pvsyst coefficients, which
 seem to be different from `pvlib.pvsystem.calcparams` works, also test the
 actual iv curve points (isc, voc, imp, vmp, pmp) generate from the coefficients in
 pvlib with the original data
- set tolerances to match the test data
- suffix with _tests instead of prefix
- remove matplotlib from pvsyst_parameter_estimation, I save all of the
 snippets so maybe they can go in the tests folder?
- removes graphic & convergeparamsfig args from check_converge and the
 graphic arg from pvsyst_parameter_estimation funcs
- safely import scipy and statsmodels inside pvsyst_parameter_estimation so
 bare linux tests can run on CI, add comment at top
- also remove mpl from setup optional requires
- add pvsyst_parameter_estimation to api docs
- raise runtime error if doesn't converge instead of the oflag
- remove oflag arg from pvsyst_param_est everywhere
- add decorator for requires_statsmodels in conftest and apply to
 test_pvsyst_parameter_estimation
- rename Schmaker_QSpline and est_single_diode_param tests

Signed-off-by: Mark Mikofski <[email protected]>
- remove TODO's to delete old code replaced by np.where in
 calc_theta_phi_exact
- stickler fixes:
  * overindented lines and continutation characters in calc_theta
  * long line in docstring of estimate_single_diode
  * over indentation in pvsyst_param_est
  * extra line after function check_converge before fun_rsh
- also add TODO to change convergence relative ratios
@cwhanse cwhanse mentioned this pull request Aug 26, 2020
8 tasks
@cwhanse
Copy link
Member

cwhanse commented Aug 26, 2020

Also, many of the convergence failures and parameter checks are untested: https://codecov.io/gh/pvlib/pvlib-python/compare/3183cc9149b7a7d8d42fa707295160a9d98873f2...5bd14c7c41f991ba59694bb464cd273c5a611533/diff

It's actually hard to compose IV curve data that incur some of those failures. I don't know if it's worth the time to test those lines, if we feel it is needed I'll hunt through some measured data and see if I can find some that fail by each criteria.

@adriesse
Copy link
Member

Sorry, I started looking through, but it's way too much for me. I'd really need to try using the code before venturing to make the most minor comment.

@cwhanse
Copy link
Member

cwhanse commented Aug 26, 2020

@kanderso-nrel what do you make of the two failed tests? No actual test failures but a mysterious error code from Cmd.exe... I have tried re-running the failed tests on previous commits, sometimes the error clears, but for the py35 windows build it is persistent.

@kandersolar
Copy link
Member

@kanderso-nrel what do you make of the two failed tests? No actual test failures but a mysterious error code from Cmd.exe... I have tried re-running the failed tests on previous commits, sometimes the error clears, but for the py35 windows build it is persistent.

Not 100% sure but it looks like something in the shapely/geos world doesn't like getting garbage collected when the tests are done. I think it's fine to ignore for this PR and worth digging into further if it continues to be a nuisance. Could be that we need to change how we install shapely.

Copy link
Member

@wholmgren wholmgren left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

very quick read...

EPS = np.finfo('float').eps**(1/3)


constants = OrderedDict()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why an OrderedDict instead of regular dict?

Comment on lines 23 to 26
def numdiff(x, f):
"""
Compute first and second order derivative using possibly unequally
spaced data.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

MetPy implements 3-point formulae for n-dimensional data, where the pvlib function uses 5-point formulae only for a function of a single variable. I don't see a way for pvlib to use the MetPy function, because the polynomial expressions for the weight at each point don't extend such that a 5point function can use a 3point function in it's calculation. Way back when I was working out this method there was a reason to use 5-point formulation, probably for some level of smoothing of the input data, but I don't recall exactly.

The MetPy function implement some logic to handle the partial windows at each end of the data, where the pvlib function ducks the issue and returns Nan. That could be an improvement to bring from MetPy to pvlib. I don't think it would substantially improve the downstream function that currently uses the output of this numdiff function.

I'm sure the pvlib function coding could be improved - it's very explicit. The MetPy function has sophistication that isn't needed for a function of a single variable.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @cwhanse, @wholmgren , & others,

Not a blocker to me, but curious if you had considered other numerical differentiation tools like:

  • scipy.misc.derivative - allows you to change the number of points used (from 3 to 5, etc) as long as it's odd - check the source to see the weights used
  • statsmodels.tools.numdiff - also does complex numbers and can return the gradient, Jacobians, and Hessians
  • numdifftools - latest release was June, 2019, but this is based on the MATLAB derivest originally authored by legendary Kodak mathematician John D'errico
  • algopy - I believe there used to be more packages like this, but Algopy seems to be the only one (in Python) that has survived. It requires significant effort, and overloads NumPy, but I believe it calculates derivatives of each operation as it occurs rather than of functional outputs

Anyway, just putting this out there, I don't actually expect you (or MetPy) to change anything, just always curious how or why there are so many tools that seem to all do the same thing, in apparent contradiction to pep20: the Zen of Python:

There should be one-- and preferably only one --obvious way to do it.

Copy link
Member

@cwhanse cwhanse Aug 27, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @mikofski . I would much prefer to import this capability from numpy or scipy than host it in pvlib. Maybe it's out there in python-land, when we wrote the Matlab code we couldn't find it. Looking at the resources you identified:

  • scipy.misc.derivative requires equally spaced x-data, which is not a safe assumption for measured IV curves.
  • statsmodels.tools.numdiff uses hardcoded 2 point (forward difference) or 3 point (centered difference) formula. It might accommodate unequal data spacing, not clear to me. Even so, see the early remark about the smoothing achieved by a higher order formula.
  • numdifftools looks like the primary use case is automated differentiation, which is related but different than calculating a numerical derivative from data which is the need for the sdm.fit_xxx_sandia functions.

MetPy's numdiff is based on the same reference I used (handles unequal data spacing). The function could be made more general by adding higher order formulas, then imported here.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for checking Cliff. Let's go with this implementation.

Do we really want to make this part of the public API? Same concern for the qspline function.

Comment on lines 169 to 170
if len(voltage) != len(current):
raise ValueError('voltage and current must have the same length')
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

generally -1 on this kind of check in pvlib. vstack will throw its own informative enough ValueError if they don't match.

Copy link
Member

@cwhanse cwhanse Aug 27, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hold over from Matlab, doesn't appear it was tested either

y = y.flatten()

n = x.size
assert n == y.size
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

assert calls don't belong in code. again -1 on this kind of checking.


@requires_scipy
@requires_statsmodels
def test_fit_pvsyst_sandia(npts=3000):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can this test be broken up into much smaller chunks?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it's a long test function but I don't see the advantage of breaking it up. There's only one test with many outputs being checked.

The test produces 1) Pvsyst parameters (9 values), 2) five diode equation values for each IV curve, and 3) a Boolean array which is True if the parameter procedure converged for an IV curve. Each of these outputs is tested. Also, the estimated Pvsyst parameters are pushed forward through the Pvsyst model and the calculated IV curves are compared with the input IV curves.

Half of the code is extracting variables from the text files containing the input IV curves and the known solution. We could separate the IO portion into a helper function but it would only be used here.

Long term, if pvlib adopts an explicit convention for IV curve data, there would be code in pvlib.iotools to read/write that data, and the text files used here would conform to that convention.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think a helper function or two would make the test more readable but fine to leave as is.

@wholmgren
Copy link
Member

I've seen those errors on a lot of the recent PRs. I'm hoping they magically go away with a future update to something. Python 3.5 is nearing end of life and Python 3.9 is coming soon, so we should probably drop 3.5 soon anyways.

Copy link
Member

@wholmgren wholmgren left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @cwhanse.

We'll need a whats new entry too.

@@ -90,6 +90,16 @@ def assert_frame_equal(left, right, **kwargs):
requires_scipy = pytest.mark.skipif(not has_scipy, reason='requires scipy')


try:
import statsmodels.api as sm # noqa: F401
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
import statsmodels.api as sm # noqa: F401
import statsmodels # noqa: F401

Comment on lines 14 to 18
BASEDIR = os.path.dirname(__file__)
TESTDIR = os.path.dirname(BASEDIR)
PROJDIR = os.path.dirname(TESTDIR)
DATADIR = os.path.join(PROJDIR, 'data')
TESTDATA = os.path.join(DATADIR, 'ivtools_numdiff.dat')
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

prefer pathlib.Path for new code

'I_L_ref', 'I_o_ref', 'EgRef', 'R_sh_ref', 'R_sh_0', 'R_sh_exp', 'R_s',
'gamma_ref', 'mu_gamma']
varlist = ['iph', 'io', 'rs', 'rsh', 'u']
pvsyst = OrderedDict(key=(paramlist + varlist))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is ordereddict needed here?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nope


@requires_scipy
@requires_statsmodels
def test_fit_pvsyst_sandia(npts=3000):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think a helper function or two would make the test more readable but fine to leave as is.

Comment on lines 8 to 12
BASEDIR = os.path.dirname(__file__)
TESTDIR = os.path.dirname(BASEDIR)
PROJDIR = os.path.dirname(TESTDIR)
DATADIR = os.path.join(PROJDIR, 'data')
TESTDATA = os.path.join(DATADIR, 'ivtools_numdiff.dat')
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

pathlib here too

@@ -0,0 +1,52 @@
4.50204,0,nan,nan
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why .dat when the format is csv?

@@ -0,0 +1,3588 @@
36,0.0054,-0.0774,Mitsubishi PV-UE125MF5N cSi
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

.csv here too?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And moved both files to pvlib\data

Comment on lines 23 to 26
def numdiff(x, f):
"""
Compute first and second order derivative using possibly unequally
spaced data.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for checking Cliff. Let's go with this implementation.

Do we really want to make this part of the public API? Same concern for the qspline function.

@cwhanse
Copy link
Member

cwhanse commented Aug 28, 2020

Odd that I can't reply to a few of the comments.

  • I added helper functions for the test that read the data files, maybe sometime we can replace those custom functions with a pvlib function.
  • numdiff and schumaker_qspline are private.

Copy link
Member

@wholmgren wholmgren left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok to merge when the tests pass.

@wholmgren
Copy link
Member

@cwhanse the 3.8 configuration also needs statsmodels https://github.com/pvlib/pvlib-python/blob/master/ci/requirements-py38.yml

@cwhanse
Copy link
Member

cwhanse commented Aug 28, 2020

Feels good to finally merge this. Thanks @mikofski for moving this forward, @Peque for the improvements, and especially @mattguttenberg for the port from Matlab - hope you are still watching.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants