This document summarizes the work done from September 2010 to February 2011
on portal type classes, ZODB property sheets, and accessor generation.
It should help you get a rough understanding of what a portal type class is and
how to use it.
Portal type class
What has been done?
- ERP5 objects are converted to portal type instances.
- Portal type classes live in the module erp5.portal_type, and every ERP5
  object is an instance of one of them. For example, a Person object should
  now have the erp5.portal_type.Person type.
- The last part of the class path is invariably the portal type name. If you
  want an erp5.portal_type.Banana class to be created, you need to create a
  portal type (in Types Tool / portal.portal_types) named "Banana".
As a developer, what do I need to pay attention to?
You cannot create instances of documents directly:
container._setObject("some_id", MyUberDocumentClass("some_id"))
This usage should be banned, and will break. Instead, you should use
container.newContent(portal_type="...", id="....").
You can experiment to understand the difference: the first form creates an
instance of the Document class, the second creates an instance of a portal
type class. We only want instances of portal type classes.
All objects you create must have a portal type object defined in Types Tool.
No exceptions. Tools should have a portal type as well.
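The difference between the two creation styles can be illustrated without ERP5 at all. The following is a minimal sketch, not ERP5's implementation: PortalTypeRegistry, Container and the plain Person document class are hypothetical stand-ins, and only the newContent and _setObject method names come from the text above.

```python
class Person:
    """Stand-in for a filesystem Document class."""
    def __init__(self, id):
        self.id = id


class PortalTypeRegistry:
    """Hypothetical stand-in for the erp5.portal_type module."""
    def __init__(self, document_classes):
        # Maps a portal type name to the Document class it is based on.
        self._document_classes = document_classes
        self._generated = {}

    def get_class(self, portal_type):
        # One class per portal type name, created once and then reused.
        if portal_type not in self._generated:
            # Raises KeyError when no such portal type is defined.
            base = self._document_classes[portal_type]
            klass = type(portal_type, (base,), {"__module__": "erp5.portal_type"})
            self._generated[portal_type] = klass
        return self._generated[portal_type]


class Container:
    """Stand-in for a folder-like ERP5 object."""
    def __init__(self, registry):
        self._registry = registry
        self._objects = {}

    def _setObject(self, id, obj):
        # Stores whatever it is given, with no portal type lookup.
        self._objects[id] = obj
        return id

    def newContent(self, portal_type, id):
        # Always goes through the registry, so the new instance gets the
        # generated portal type class.
        obj = self._registry.get_class(portal_type)(id)
        self._objects[id] = obj
        return obj


registry = PortalTypeRegistry({"Person": Person})
container = Container(registry)

container._setObject("direct", Person("direct"))
good = container.newContent(portal_type="Person", id="via_factory")

direct_class = type(container._objects["direct"])
factory_class = type(good)
```

In this sketch direct_class is the plain Person document class, while factory_class is a generated subclass whose MRO starts with the generated class followed by Person.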
What needs to be done?
- Migration: objects that have a classpath different from
  ERP5Type.Document.XXX are not all migrated. migrateToPortalTypeClass tries
  a simple, cheap heuristic to migrate at least the items just under the
  portal, and the contents of Types Tool, but it's not perfect.
- Candidates for non-migration are: items in ERP5Type.Core (Folder, aka
  Modules, RoleInformation, ActionInformation, ...), items in the top level of
  a product (ERP5Type.XMLMatrix ?) and, generally speaking, all the "bastard"
  classes that did not belong to a clean Document/ subfolder in your Product.
- Small scripts can be written to recurse through a site and call
  Base._migrateToPortalTypeClass on objects. But this needs to be done
  carefully, for instance by first checking that a portal type does exist for
  the object you're migrating.
- Note on auto-migration: be careful when implementing "migrate automatically"
  features. The checks that you perform on instance bootup must be light. The
  checks that you perform in ERP5Site.migrateToPortalTypeClass must be light
  as well.
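A migration script along the lines described above could look like the following sketch. It assumes that objects expose getPortalType() and objectValues(), and that _migrateToPortalTypeClass() can be called once per object; the FakeObject class and the known_portal_types set are hypothetical stand-ins that make the example runnable outside ERP5.

```python
def migrate_tree(obj, known_portal_types, skipped=None):
    """Recursively migrate obj and its sub-objects.

    Objects whose portal type is not defined are skipped and reported
    instead of being migrated blindly.
    """
    if skipped is None:
        skipped = []
    if obj.getPortalType() in known_portal_types:
        obj._migrateToPortalTypeClass()
    else:
        skipped.append(obj.id)
    for child in obj.objectValues():
        migrate_tree(child, known_portal_types, skipped)
    return skipped


class FakeObject:
    """Hypothetical stand-in for an ERP5 object, for illustration only."""
    def __init__(self, id, portal_type, children=()):
        self.id = id
        self._portal_type = portal_type
        self._children = list(children)
        self.migrated = False

    def getPortalType(self):
        return self._portal_type

    def objectValues(self):
        return self._children

    def _migrateToPortalTypeClass(self):
        self.migrated = True


leaf = FakeObject("1", "Person")
orphan = FakeObject("2", "Legacy Thing")
module = FakeObject("person_module", "Person Module", [leaf, orphan])

skipped_ids = migrate_tree(module, {"Person", "Person Module"})
```

On a real site, the set of known portal types could presumably be built from portal.portal_types.objectIds(), and the list of skipped ids gives a cheap report of what still needs a portal type.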
Hacking advanced notes
- Note that Base_viewDict now also prints the class of the object, to allow
  easy debugging/checks.
- Any doubt about a specific object? Check its MRO (method resolution order):
  pp context.__class__.mro() in a pdb session. The first class should be the
  portal type class (erp5.portal_type.Person), the second should be the
  Document class (ERP5.Document.Person.Person).
- Putting a __setstate__ in the Base class to attempt migration of all objects,
  wherever they are, does not work. You might know that Zope enforces its very
  own method resolution order, and as such, a method in Base is not guaranteed
  to always override the methods from Persistent.Persistent.
Property sheets
What has been done?
- Creation of a new tool, Property Sheet Tool (portal.portal_property_sheets),
  that holds property sheets.
- This allows removing the old Property Sheets that we had on the filesystem,
  and putting them in the ZODB instead.
As a developer, what do I need to pay attention to?
- Create new Property Sheets as subobjects of Property Sheet Tool.
- You have four portal types depending on the Property you want to define:
  Standard Property, Acquired Property, Category and Dynamic Category. The
  first three should be self-explanatory; the last one is meant to allow
  assigning base categories to an object using a TALES expression to select
  categories.
- When installing a Business Template, or with a new instance, you might need
  to migrate your filesystem property sheets into ZODB objects. There is an
  action on the Property Sheet Tool for that.
What needs to be done?
- Validation of property sheets, to avoid regenerating accessors on
  invalid/incomplete changes. A few notes:
  - We can/should define constraints on properties to define how much has
    to be filled in for the property to be considered valid.
  - We could use a validation workflow, but it requires exporting objects to
    bootstrap/portal_*.xml in a "validated" state so that new instances can
    directly load correct objects without thinking.
  - Another way is to filter in interaction workflows: do not trigger accessor
    regeneration if the property is inconsistent. On the other hand, this
    solution is imperfect: save an incomplete property, restart your instance,
    and you're doomed.
- A better way to add new Properties. As the number of portal types/constraints
  grows, the "Action" menu will grow too long and become annoying.
Hacking advanced notes
- The structure of the Property Sheet Tool itself is not complicated. But
  accessor generation with property sheets is generally tricky, as it requires
  bootstrapping an instance from nothing, knowing that we need to fetch most of
  our information from the ZODB. Chicken-and-egg problems everywhere.
- With our through-the-web era, we introduce new issues and change the way we
  are used to updating our instances. We are used to updating products first,
  then the ZODB. But when editing "vital" content in the ZODB, it seems that
  the ZODB content should be updated first. Think about someone changing a
  portal type or a property sheet, and then changing the underlying document
  code: you need to update the ZODB/business templates first, but to do so we
  need products that are good enough to allow restarting an instance in an
  "old" state. We have no mechanism whatsoever to provide this in ERP5; it is
  important to think about this, or we (you) will run into big trouble in the
  future. The upgrader approach is good for projects, but less than ideal for
  developers, as we cannot write upgraders for each revision, nor can we
  write heavy code everywhere to handle "what if the developer updates like
  that"... Maybe portals need a kind of revision number attribute, allowing
  us to tell easily, and extremely cheaply, how old the core part of an
  instance is: relying on the template tool or any kind of complex
  introspection is too slow to be acceptable.
Constraints
What has been done?
- Constraints have always been part of Property Sheets. They now need to be
  defined in Property Sheet Tool, just like any other Property.
- But whereas we previously had classes in the Constraint/ folder of products,
  we now need to define a Document (and a portal type) for each constraint. In
  the Property Sheet view, you will see that several constraints can be added
  as subobjects: there is one Document and one Portal Type for each
  constraint type.
- Constraints implement the Predicate API.
As a developer, what do I need to pay attention to?
- Projects need to rewrite their Constraint classes as Documents. See the
  difference between ERP5Type.Constraint.PropertyExistence and
  ERP5Type.Core.PropertyExistenceConstraint for an idea of the simple changes
  that have to be made. You need to derive from the constraint mixin and
  override the _checkConsistency method.
- Business Templates that have ConstraintTemplateItem items should be
  rewritten, turning each ConstraintTemplateItem into one DocumentTemplateItem
  (the Constraint Document class) and one new Portal Type.
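The shape of such a rewritten constraint can be sketched as follows. This is only an illustration under stated assumptions: the ConstraintMixin below is a hypothetical stand-in for ERP5's constraint mixin, and the _checkConsistency signature and its list-of-messages return value are assumed, not taken from the real code. A real constraint would also be a Document with its own portal type, configured from a Property Sheet.

```python
class ConstraintMixin:
    """Hypothetical stand-in for ERP5's constraint mixin."""

    def checkConsistency(self, obj, fixit=False):
        # Public entry point: delegates to the hook that subclasses override.
        return self._checkConsistency(obj, fixit=fixit)

    def _checkConsistency(self, obj, fixit=False):
        raise NotImplementedError


class PropertyExistenceConstraint(ConstraintMixin):
    """Reports every listed property that is missing on the object."""

    def __init__(self, property_ids):
        self.property_ids = list(property_ids)

    def _checkConsistency(self, obj, fixit=False):
        # This sketch only reports problems; it never tries to fix them.
        return [
            "Property %s is missing" % property_id
            for property_id in self.property_ids
            if not hasattr(obj, property_id)
        ]


class Document:
    """Stand-in for the object being checked."""
    title = "Some title"


constraint = PropertyExistenceConstraint(["title", "reference"])
errors = constraint.checkConsistency(Document())
```

Here errors contains a single message about the missing reference property; an empty list means the object is consistent.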
What needs to be done?
- A few tests in trunk fail because they use Business Templates that have
  ConstraintTemplateItem items (InventoryConstraint, notably, needs to be
  adapted).
Hacking advanced notes
- The performance of the Predicate API for constraints needs to be assessed
  carefully.
- Since Constraints are now Predicates, they are indexed in the predicate SQL
  table: some trunk tests had been failing because Predicate searches were
  returning constraints instead of (rules, other predicates... your pick).
  Your project's tests might need to be refined.
Accessor generation
What has been done?
We're getting rid of Base._aq_dynamic: accessors are now generated directly
from Property Sheet definitions and put into Accessor Holders, one Accessor
Holder for each existing Property Sheet item.
Accessor Holders are classes, and you can see them in the method resolution
order of your ERP5 objects. For instance, for a person,
person.__class__.mro() is:
(... class listing not preserved in this copy ...)
Each accessor_holder class corresponds to accessors that come directly from a
Property Sheet. Note as well accessor_holder.BaseAccessorHolder, which contains
common methods such as related category getters and portal type group getters.
_aq_reset is gone as well.
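To make the idea of accessor holders concrete, here is a minimal sketch in plain Python. It is not ERP5's generation code: create_accessor_holder, the property lists and the "accessor_holder" module label are assumptions made for illustration. It only shows how one class per property sheet can carry generated getters and setters and then show up in the MRO of the final class.

```python
def make_getter(property_id):
    def getter(self):
        return getattr(self, "_" + property_id, None)
    return getter


def make_setter(property_id):
    def setter(self, value):
        setattr(self, "_" + property_id, value)
    return setter


def create_accessor_holder(property_sheet_id, property_ids):
    """Build one class holding the accessors of one property sheet."""
    namespace = {"__module__": "accessor_holder"}
    for property_id in property_ids:
        # first_name -> FirstName, giving getFirstName / setFirstName
        camel = "".join(part.capitalize() for part in property_id.split("_"))
        namespace["get" + camel] = make_getter(property_id)
        namespace["set" + camel] = make_setter(property_id)
    return type(property_sheet_id, (object,), namespace)


class Person:
    """Stand-in Document class."""


# Hypothetical property sheet contents.
holders = [
    create_accessor_holder("Person", ["first_name", "last_name"]),
    create_accessor_holder("Reference", ["reference"]),
]

# The portal type class: Document class first, then the accessor holders.
PersonType = type(
    "Person", (Person,) + tuple(holders), {"__module__": "erp5.portal_type"}
)

person = PersonType()
person.setFirstName("Ada")
mro_names = ["%s.%s" % (c.__module__, c.__name__) for c in PersonType.mro()]
```

Because the accessors live on ordinary classes, nothing needs to be generated at attribute lookup time: normal method resolution finds getFirstName on the holder class.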
As a developer, what do I need to pay attention to?
- This is mostly related to test writing: _aq_reset calls were replaced by
  compatible resetDynamicDocuments calls. They work, but they are slow.
  It is good for performance if you can check whether any of those calls can
  be replaced by resetDynamicDocumentsAtTransactionBoundary, which delays the
  reset until the end of the transaction. See the section in the hacking
  notes for details.
What needs to be done?
- Cleanups, cleanups, cleanups. Utils.py and Base.py contain a lot of unused
  code.
- DocumentationHelper code is probably quite broken (has it always been?) even
  if tests do not reflect that.
- Two XML files in ERP5/bootstrap are used to install sites (portal_types and
  portal_property_sheets). There is no easy way to export or regenerate
  them from a user's point of view. What I did was write a simple test fixture,
  save a fresh new site, edit/add the new portal types, and export
  the XML. The problem is that we probably do not want to export ALL the content
  of the tools, but only a restricted set of portal types/property sheets.
Hacking advanced notes
Performance
The effective tradeoff of this change is the following: we trade dynamic lazy
generation for static generation plus a few mro()-deep lookups. See, for
example, the Base._edit code, where we have to look through a class's mro() to
fetch the list of restricted methods. Places like this, where we have to walk
a class's mro(), are costly. On the other hand, they are EASY to optimize. With
lazy aq_dynamic, the environment was constantly changing, and we had no
guarantee that everything was generated. But with portal type classes/accessor
holders, nothing changes once the class has been generated: after
loadClass() (ERP5.dynamic.lazy_class) returns, nothing will happen to the
class anymore; its data is "static". This means that all the deep computations
we do can be safely cached on the class object for later. Back to our _edit
example, the list of restricted method ids can, and probably should, be
computed once and stored on the portal type class.
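The caching idea can be sketched in a few lines of plain Python. The restricted_method_ids attribute and the helper name are hypothetical and do not reflect how Base._edit actually stores this information; the point is only that the walk over mro() happens once per class and the result is stored on the class itself.

```python
def get_restricted_method_ids(klass):
    """Return the restricted method ids of klass, computed once per class."""
    # Look only in the class's own __dict__, so that a subclass does not
    # reuse the value cached on one of its parents.
    cached = klass.__dict__.get("_cached_restricted_method_ids")
    if cached is None:
        method_ids = set()
        for base in klass.mro():  # the costly walk, done once per class
            method_ids.update(base.__dict__.get("restricted_method_ids", ()))
        cached = frozenset(method_ids)
        setattr(klass, "_cached_restricted_method_ids", cached)
    return cached


# Stand-in class hierarchy with hypothetical restricted method ids.
class Base:
    restricted_method_ids = ("_setTitle",)


class Person(Base):
    restricted_method_ids = ("_setFirstName",)


class Employee(Person):
    pass


person_ids = get_restricted_method_ids(Person)
employee_ids = get_restricted_method_ids(Employee)
```

Such a cache is only valid while the class does not change, so it would have to be discarded whenever dynamic documents are reset and classes are regenerated.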
A new performance test can now be written. On a tiny instance, that only has
erp5_core:
portal.portal_types.resetDynamicDocuments()
for property_sheet in portal.portal_property_sheets.contentValues():
    property_sheet.createAccessorHolder()
Then time this loop. The impact of accessor generation is now easy to measure
and improve, instead of being a giant octopus with tentacles that unfold at
every dynamic call. Once you've looped over this list, you're mostly done,
and the rest of the code only gathers the useful accessor holders and puts them
on the right classes: cherry picking, with a workflow twist, as you still need
to wrap accessors as WorkflowMethod on the portal type class. We may start
with a relatively higher cost, but it is easier to profile and easier to
optimize.
If you then want to assess the cost of Workflow method generation, you can do
something like:
import erp5.portal_type

portal.portal_types.resetDynamicDocuments()
# Force every portal type class to be loaded.
for portal_type_id in portal.portal_types.objectIds():
    getattr(erp5.portal_type, portal_type_id).loadClass()
And once again, time it.
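Timing either loop needs nothing more than a small wrapper. The following sketch uses only the standard library; timed and generate_all are hypothetical names, and generate_all merely stands in for the loops above so that the example runs outside ERP5.

```python
import time


def timed(label, func, *args, **kwargs):
    """Run func once and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = func(*args, **kwargs)
    elapsed = time.perf_counter() - start
    print("%s: %.3f s" % (label, elapsed))
    return result, elapsed


def generate_all(items):
    # Stand-in for the loop over property sheets or portal types.
    return [item.upper() for item in items]


result, elapsed = timed(
    "accessor generation", generate_all, ["person", "organisation"]
)
```

time.perf_counter only exists on Python 3; on a Python 2 instance, time.time() would be the closest substitute. Running the measurement several times and keeping the best figure reduces noise.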
Memory Cost
Generally speaking, we generate things less blindly, and after cleanups the
memory usage should drop below what it was with aq_dynamic.
The globals in Utils are evil, and cache too much. I suppose that removing them
or emptying them WILL save a lot of memory. Similarly, I'm questioning the
validity and use of the workflow_method_registry attributes on portal
type/property holder/accessor holder classes.
resetDynamicDocuments vs resetDynamicDocumentsAtTransactionBoundary
There was something relatively bad in the way we were using _aq_reset. Scenario:
# some_portal_type
portal_type.edit(type_class="Foo", type_base_category_list=["source",])
This edit fires two workflow triggers, one for the class change, and one
for the base category change. Each trigger used to cause an _aq_reset call.
Generally speaking, if during one transaction we had N property changes on M
different objects, we would trigger _aq_reset N*M times. That raises the
question: is it absolutely necessary to reset accessors immediately after each
action? If we think about it, 100% of the actions that can trigger accessor
regeneration are user-triggered. This means that transactions will be
short-lived and that, in case of success, a commit() will happen shortly
afterwards. So can we delay the reset until commit time? Yes, it seems so.
This has a few nice properties:
- If one edit fires several workflow triggers, only one reset will happen.
- In tests, if we pay attention to what we're doing, we can group portal
  type / accessor / base category setups and minimize the number of resets.
Why did I care so much about the number of resets? With the new accessor
generation, we do a bit more during generation, and the generation of basic
properties in particular is very costly. So chaining several resets *is*
costly, much more than two _aq_reset calls.
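The coalescing behaviour can be sketched with a pending flag and a before-commit hook. This is not the ERP5 implementation: the Transaction and TypesTool classes below are hypothetical stand-ins, and only the two reset method names come from the text.

```python
class Transaction:
    """Hypothetical stand-in for a transaction with before-commit hooks."""
    def __init__(self):
        self._hooks = []

    def add_before_commit_hook(self, hook):
        self._hooks.append(hook)

    def commit(self):
        # Run the registered hooks once, then forget them.
        hooks, self._hooks = self._hooks, []
        for hook in hooks:
            hook()


class TypesTool:
    """Stand-in for the tool that owns the reset methods."""
    def __init__(self, transaction):
        self._transaction = transaction
        self._reset_pending = False
        self.reset_count = 0

    def resetDynamicDocuments(self):
        # Immediate and expensive.
        self.reset_count += 1

    def resetDynamicDocumentsAtTransactionBoundary(self):
        # Register the reset once; later requests in the same transaction
        # are absorbed by the pending flag.
        if not self._reset_pending:
            self._reset_pending = True
            self._transaction.add_before_commit_hook(self._reset_at_commit)

    def _reset_at_commit(self):
        self._reset_pending = False
        self.resetDynamicDocuments()


transaction = Transaction()
types_tool = TypesTool(transaction)

# One edit firing two triggers, as in the scenario above.
types_tool.resetDynamicDocumentsAtTransactionBoundary()
types_tool.resetDynamicDocumentsAtTransactionBoundary()
count_before_commit = types_tool.reset_count
transaction.commit()
count_after_commit = types_tool.reset_count
```

No reset happens before the commit, and exactly one happens at commit time, however many triggers fired during the transaction.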