Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

APL from CG trajectories (Martini FF) #61

Closed
sizeineb opened this issue Feb 22, 2020 · 1 comment
Closed

APL from CG trajectories (Martini FF) #61

sizeineb opened this issue Feb 22, 2020 · 1 comment
Labels

Comments

@sizeineb
Copy link

Dears,

I have a coarse-grained trajectory (Martini Force Field) with a protein embedded in a bilayer with 6 different types of lipids. I would like to compute the Area per Lipid for each class, in each leaflet separately as a function of time.

I am new in PyBILT and I am having a 'beginner' problem.

I construct the analyzer:
analyzer = BilayerAnalyzer(structure='npt.tpr',
trajectory='traj.xtc',
selection='resname POPC')

Next, I added the type of analysis that I need:
analyzer.add_analysis('apl_grid apl_1')

Now, I am a little bit confused about how to run the analysis and retrieve the output.
I did this:
output = analyzer.get_analysis_data('apl_1')
but the output is an empty dictionary.

Is it possible to have some help here, please? Let's say I would like to compute the APL of POPC in time. What is the correct way to do it using PyBILT? I tried to follow jupyter_notebooks examples but nothing is working for me.

NB: I am using a tpr file as a structure file because my system is Coarse-Grained and the masses /charges are mentioned here and not on a regular file (PDB or gro).

Thank you in advance for your help.

@blakeaw
Copy link
Collaborator

blakeaw commented Feb 24, 2020

Hi @sizeineb ,

It looks like you are missing the call to the analyzer.run_analysis() function before you get the output, which is why are you getting the empty dictionary. So it would be:

analyzer.run_analysis()
output =  analyzer.get_analysis_data('apl_1')

Another note is that the initial selection needs to include all the membrane lipids. The 'apl_grid' analysis will use the auto leaflet segmentation provided by the bilayer_analyzer and will segregate the lipid types using the lipid RESNAME, so different lipid types will be automatically separated which is why you need to provide them all in the initial selection. The output from 'apl_grid' is a dictionary keyed by lipid resname. Each element in the dictionary is a NumPy array with the length being the number of frames analyzed. For each frame there are four columns with index 0=>time, 1=>instantaneous area per lipid, 2=>running time average of area per lipid, and 3=>standard deviation of the running average.

However, the area per lipid analysis doesn't explicitly support membrane-embedded proteins yet, so the area per lipid estimates you get might be inaccurate or biased due to space occupied by the protein; the addition of this functionality has been in my to do list for a while, Issue #52, but I never have gotten around to it.

Here's a possible option you could try if the protein spans the bilayer:
I'm not sure if or how well this would work, but you could try including all the protein residues that are directly embedded in the membrane as a part of your 'membrane' selection. As long as each one has a unique RESID the bilayer_analyzer will segment and include them in its list of lipids. Then the apl_grid analysis should include them and segment their average areas per lipid (and protein residues in this case) by RESNAME; essentially each protein residue would be treated as lipid for the purposes of estimating the area per lipid and in this way you account for the area occupied in the lateral plane by protein residues.

If you want to observe the lateral area segmentation you can output the lipid grids to a PDB file. To access them you have to iterate over the bilayer_analyzer rather than calling the run_analysis() function. For example (instead of calling analyzer.run_analysis() function):

# look at the first five frames
analyzer.set_frame_range(start=0, end=4, interval=1)
for _frame in analyzer:
    # get the frame number
    fs = "frame_{:03d}".format(analyzer.reps['com_frame'].number)
    analyzer.reps['lipid_grid'].write_pdb("lipid_grid_"+fs+".pdb")

Then you could load the lipid grid PDBs alongside those frames from the trajectory to get a visual idea of how well it is accounting for the protein area.

Hope this helps! Let know if you have other questions or need further assistance. And be sure to let me know how it goes if you try the workaround above to account for the protein area.

@blakeaw blakeaw closed this as completed May 22, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

2 participants