-
Notifications
You must be signed in to change notification settings - Fork 50
Initial refactoring #119
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Initial refactoring #119
Conversation
Thanks for getting this broken down! This is definitely the right size and scope Before doing a full review, I have a few higher level comments. While I'd normally say to each their own, I don't think the amount of shade thrown on python is (fully) warranted. Now I'm probably equally biased the other way here, but I think some of your judgements fly in the face of standard python conventions and will make this code harder to work on longer term.
|
@dihm In terms of the actionable part of this review I can put the MainWindow back in |
Regarding the other topics I don't want to start a philosophical war.
|
Update: I put |
And split "mostly gui" code into more files. This should pretty much complete the refactoring of what was |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for clarifying some things for me. I hadn't looked closely enough to realize how qapplication
was defined. I agree that isn't great. It's a good thing to fix here while we are moving things around anyway.
As for import lyse.__main__
, while I have little doubt documentation and behavior could be improved here at a language level, I still stipulate that it should never be done. So I wouldn't really classify unusual things happening when one does it as a bug and I don't think we should structure our code to protect against it. It shouldn't be done in the first place. In fact, I see the whole purpose of this PR is to ensure that action never has to happen.
This is relevant to the only real concern I have with the PR, namely that all these imports in __main__.py
are decoupled from their usage. It means tooling can't ensure imports are used/missing and therefore ensuring that import list is accurate is a purely manual process (lyse will still work fine even if we add extra imports or take needed ones off). I really don't like it.
I'd much rather move current main.py
stuff back into __main__.py
and continue to rely on the import side effect to handle the splash. I don't see us losing any functionality, and it conforms to standard conventions so we are less likely to surprise future developers. Unless I'm missing something else, the only thing the current implementation is solving is allowing import lyse.__main__
, but I don't think we should support that anyway.
Now, perhaps I am missing something. I am assuming that Lyse
and LyseMainWindow
do not need to be imported elsewhere for any reason in the future. My imagination may be limited, but I just don't see a valid reason for needing to do that that doesn't also entail a major structural change to how lyse works. And if that is the case, we should discuss it (ideally as another PR/Issue/etc, so this can move along).
Pretty sure my only other meaningful change request is some minor adjustments to the file structure. I think analysis.py
is kind-of vague given analsis_subprocess.py
exists. It also is only used in widgets.py
for use with in RoutineBox
(via an undeclared import, which is incidentally leading to a circular dependency since analysis.py
has to import RoutineBox
). I'd advocate for having a routines.py
file that has RoutineBox
and AnalysisRoutine
in it. I'd also advocate moving more of the QT widgets from filebox.py
into widgets.py
(basically any class that only has QT dependencies and no specific lyse logic baked in should move to widgets.py
).
Finally, I've noticed there are a few lingering unused imports in the files. Given that I'm asking for some changes here I won't list them all out, but an example is lyse.analysis
and qtutils.icons
in filebox.py
. Annoyingly, my tooling is not catching the lyse.analysis
unused import, but maybe that means something subtle is going on. Tread with caution I suppose.
…ere are some other places with interprocess / thread communication is defined and getting that together seems wise. This commit gets a place for it ready.
I am going to decouple some of these comments. First regarding the imports and SplashWindow. The SplashWindow is pure eyecandy that gives the user something to see while Lyse starts. Prior to this pull request pretty much everything that lyse was going to import was being also imported in |
Unfortunately
to make this behavior explicit and avoid the magical import. |
The last set of changes should fully resolve @dihm's points. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ispielma This is really good. Have two very small comments then I suspect it is ready to go.
I would like to stress test it a little more than the dummy shots on my home rig, but I won't have time until later this week. Hopefully that will give @philipstarkey sufficient time to look things over if he wants to weigh in. Otherwise, I think we'll be good to merge by Friday.
…ame_utilities as rangeindex_to_multiindex.
I also forgot, there is probably a little work that needs to be done on the documentation build to track the new changes. If you wanted to sort those out, I would appreciate it. If not, I'll get them sorted this week and make a PR to your branch with the changes. |
I am not that familiar with the documentation system (in terms of what is automatic and what is manual), but I will have a look.... OK so I have it generating the API in a way that is no worse than before, meaning that many functions / classes are not documented, but I think that this should be its own pull request. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Found another small issue testing the refactored code in the lab.
Thanks for sorting that out. Docs are definitely spotty, but at some level most people don't need to see the GUI docstrings anyway. In fact, we should probably consider splitting the |
…here and in dataframe_utilities.py to avoid circular dependencies, I moved these into utis.py and renamed them LYSE_PORT and LABCONFIG respectivly to denote their role as system wide constants. I also moved LYSE_PATH there as well for consistency, but re-exported it in __init__.py so it will still be accessible when lyse is imported.
In working with from lyse.dataframe_utilities import get_series_from_shot as _get_singleshot
from labscript_utils.dict_diff import dict_diff in this case when one does I strongly suggest that I amend this pull request to also create a file from lyse.lyse_api import ...
__all__ = [...] this will make explicit what is being exported and provide better control over the namespace, and the use of |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As for moving the API to another file, I'm not totally sold on the benefit. We should definitely define a __all__
, but we can just do that in place. The incidental imports still show up under import lyse
(like sys, os, etc), but I'm less concerned about that in that situation.
In any case, given the docs have recommended from lyse import *
for forever, this would be a breaking change, so we should save it for another PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm happy with where this. I'll give @philipstarkey a little more time to look it over before merging, now that the bugs have been sorted out.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've just reviewed the changes for now. I'm also hoping to find time to check out the new code and give it a look over to see if there is anything that stands out about the structure (which can be hard to see with all of the noise from the diff)
|
||
# lyse imports | ||
import lyse.dataframe_utilities | ||
import lyse.utils |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am a little bit hesitant about this lyse.utils
import. lyse.utils
is importing Qt packages. But the lyse
package (e.g. this file) can be imported in either the GUI, or the worker process. We are probably tied to a Qt matplotlib backed in the worker process anyway? But it would be nice if we were not, or at least minimally so.
One solution would be to make a utils
dir, with three files (__init__.py
, gui.py
, worker.py
- or whatever names you like) so that the utils and imports can be split between only for gui, only for worker, or for both. That allows more specific imports to be made (note that the contents of the init file will be imported if either of the others are so that one should try to stay clean of Qt stuff as well).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've had a go at implementing this and did end up settling on your suggested divisions. All said and done, I think there is a decent argument for moving dataframe_utilities
into the submodule or flattening the structure again and having utils
, gui_utils
and worker_utils
at the top level. I am leaning towards the latter, but I think there is a more fundamental issue to fix first (I'll comment the code elsewhere) before sorting out stylistic preferences like this one.
lyse.figure_manager.install() | ||
|
||
from matplotlib.backends.backend_qt5agg import NavigationToolbar2QT as NavigationToolbar |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm a little bit nervous about relocating this import to before the figure manager is installed. Any idea if it has consequences?
There is a brief line in figure_manager.py about needing to patch matplotlib before importing pylab. I think maybe this import order dependency needs investigating a little bit more, and then documenting (or we revert the change of import location and log an issue to investigate it later)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I actually think this is OK. The comment specifically calls out that the install needs to occur before importing matplotlib.pyplot
. I don't think there are issues with importing parallel submodules from elsewhere in matplotlib
.
splash.update_text('importing h5_lock and h5py') | ||
import labscript_utils.h5_lock | ||
import h5py |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Where is the h5_lock
import happening now? Until it's imported, anything that uses h5py could access h5 files without locking them, which could lead to corrupt files. Worse, if something imports h5py.File
explicitly (e.g. from h5py import File
) before h5_lock
is installed, then it won't ever get the patched version of h5py
. To be honest I'm wondering if it should happen even before numpy
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe the answer is the main GUI does not perform any direct accesses to h5 files. These accesses are provided by the main API, filebox, and dataframe utilities. The import is therefore handled within the core Lyse modules
portion.
I did, however, find an upatched h5py import in filebox that will be fixed.
# Start the web server: | ||
splash.update_text('starting analysis server') | ||
server = WebServer(app.port) | ||
splash.update_text('done') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The relocation of the WebServer
into the Lyse
class, and instantiating it before the GUI is set up, means that any messages received immediately on start could crash lyse when the WebServer.handler
method tries to access something about the UI that isn't instantiated yet.
The previous implementation worked such that messages could be received, and placed in appropriate event queues before the Qt loop was even started.
I suspect, given the auto-retry behaviour of BLACS, that there would be scenarios where the crash will occur (it's a bit of a race condition though so may not be easily replicable)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe I understand the issue here. Would moving the start of the WebServer
to the end of the GUI init be sufficient? I believe that would recover the older behavior. I don't think we can go back to the old behavior exactly since it relied on globals-like behavior which is a bit difficult to reason about.
lyse/utils.py
Outdated
try: | ||
LABCONFIG = LabConfig(required_params={"ports": ["lyse"]}) | ||
LYSE_PORT = int(LABCONFIG.get('ports', 'lyse')) | ||
except Exception: | ||
LABCONFIG = None | ||
LYSE_PORT = 42519 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not super comfortable with this. Instantiating a LabConfig
just because you import lyse.utils
has a bad code smell to it. It also isn't used by anything in this file. Seems like it was just move to fix some sort of circular dependency issue?
I'm sure there are a few possible solutions. One that may be the simplest is just to move the import of the labconfig inside the function that uses it, so that it can import it from lyse
. That's perfectly acceptable to do, and a good way to break circular dependencies if one part only needs something at run time, not load time.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TBF, the LABCONFIG
is used to define the LYSE_PORT
in this file.
In any case, I think the goal was to not hold multiple references to the LABCONFIG
instance, while needing different outputs from it in both __init__
and dataframe_utilities
. And given lyse.utils
is imported in __init__
anyway, I'm not sure the import side-effect is really all that meaningful.
Of course, we could extract LYSE_PORT
and integer_indexing
in a way that doesn't hold the reference to the labconfig at all, and even move that into their used locations (since they are only needed once each). But that feels a little overly defensive to me, especially considering that the labconfig is something that should be fairly universal. But maybe I'm missing the more subtle point here.
@ispielma I believe I have finally cleared off some time from my calendar, and this PR is well beyond over due to be finished and merged. If at all possible, I'd like to get it done in the next week or two. If that timing doesn't work for you, I can take over (by pushing to your branch or creating a new PR based on this one) and try to address Phil's review so this can finally be done. Let me know what your preferences are. |
@ispielma I have started working on some commits to address Phil's review. Are you OK with me pushing to your branch directly? I think it would be best to keep the work in a single PR (if possible, github rules for pushing to forks can be odd sometimes). |
@@ -55,13 +61,6 @@ | |||
# a flag to determine whether we should wait for the delay event | |||
_delay_flag = False |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These variables are causing quite a bit of grief for me. We have a decent number of places that need them across the modules, and I think it is generally considered a bad code smell to import from the module __init__
. Places these global-like values are used:
analysis_subprocess
_delay_flag
delay_event
spinning_top
: redefinition_updated_data
path
: redefinitionplots
: redefinitionPlot
: redefinition
figure_manager
spinning_top
utils.worker
: I've moved stuff here from__init__
on my working branch; I am slowly realizing that I don't really understand their purposespinning_top
:register_plot_class
,get_plot_class
, anddelay_results_return
_plot_classes
:register_plot_class
andget_plot_class
_delay_flag
:delay_results_return
__init__
path
spinning_top
inRun.save_result
_updated_data
inRun.save_result
I'm still trying to grok exactly how they are used and what their purpose is, as it isn't obvious to me why they are defined in this way. Most of them only appear to be necessary if you are using a script within the lyse GUI, and they wouldn't even be called with the plain API. I suspect I am unaware of some standard workflow. Do people commonly run their lyse scripts in the GUI, then also run them as stand-alone batch scripts unmodified (hence why register_plot_class
, get_plot_class
and delay_results_return
are defined in the API)?
Right now, I am inclined to move these global-like variables down to utils.worker
. Ideally I'd also like to consider other ways for the GUI to communicate to the analysis subprocesses that doesn't rely on passing through module variables. I'm probably not the man for that job, but I feel like there must be a better way. Perhaps manual injection into the AnalysisWorker.routine_module.__dict__
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Spending a little more time with this, I don't see the purpose of lyse.Plot
or lyse.plots
. They do not appear to be accessed anywhere, and they are only set to something meaningful within AnalysisWorker.do_analysis
. I am fairly confident they can just be removed. Am I missing something?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Those are part of the next step of my Lyse changes that seem to have leaked into this refactor. Their function is associated with making the lyse plot windows "first class" windows whose positions are when lyse shuts down. This is the highest priority Lyse request in my group.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, though they appear to be on mainline already. Doesn't really matter where they came from I suppose, I'll plan on keeping them in.
Would it be possible to implement figure position tracking independent of lyse globals? Chunking out distinct functionality, where feasible, goes a long way in ensuring these reviews don't drag on into absurdity.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since I have not visited this in about a year, I will use my old code as reference, but rewrite the implementation based on the current state of Lyse.
OK, I've gone ahead and pushed minor edits directly to the refactor branch. I've also created a PR to Ian's branch (ispielma#1) that contains more disruptive changes. Hopefully it makes it easier to see/discuss what they are relative to these changes as they presently exist. |
This is a real pull request that adds no functionality. Instead it is a first step for any large-scale refactoring of the
lyse
code base. All that was really done was pulling out the functional code from__main__.py
intomain.py
. Even this is inherently useful because other files / modules can import from 'main.py', but owing to shamefully poor language design__main__.py
cannot be imported from. For may larger aggregate pull request, this is why I refactored.Doing this was slightly non trivial because
__main__.py
made liberal use of global variables that had to be transformed to method / function arguments and class variables.Note that this is not intended to be a complete refactor, but any next steps should be pretty easy with the global scope variables banished.
This was tested both on a Mac and a live laboratory deployment of labscript.
@philipstarkey and @dihm : Is this about the scope you are looking for in a more-easy-to-audit pull request?