HYXT Blog

we produce valuable software for K12.
clock January 4, 2010 15:05 by author bwang

Referrers http://www.infoq.com/news/2009/12/agile-project-delays

A delay, in general, means getting something done later than scheduled, which causes distress and inconvenience. In Agile terminology, delay is likewise considered a waste. In an Agile project, a delay causes discontinuity, which in turn causes other wastes such as relearning and task switching.

Jack Milunsky attributed some of the common delays to:

  • Project approvals - waiting for projects to get approved while developers sit around, which wastes time and money.
  • Waiting for a properly prioritized list of requirements - so that work can get started.
  • Waiting for resources to become available – this is usually a sign that the organization should ask itself whether it is taking on too much work.
  • Change approval processes – this is a wasteful process in itself. If changes need approval too often, it might be easier to reduce the sprint length.
  • Increases in work-in-progress - the more work-in-progress, the longer developers have to wait before they can deploy their code to production.
  • Delays in getting the client to sign off on acceptance tests – this applies not only to sign-off but also to getting the client's time for resolving requirement issues, giving feedback on demos, and so on.

Jack mentioned that there are many delays between sprints too. The team should identify and eradicate these delays by putting in some hard work. He suggested:

You have to ensure that the backlog is properly groomed. So you need an effective PO who understands the market, the client etc. You need well written stories. You need estimates from developers early so the PO can make decisions ahead of the planning meeting. It's all about designing delays out of the system so that there are smooth hand-offs at all the transition points. And it's worth mapping this end-to-end process and identifying delays at each of these points.

Likewise, Wouter Baars listed the top causes of delays in IT projects. Some of them include:

  • Gold plating – when a team spends too much time enhancing functionality that the client has not requested.
  • Neglecting quality control - time pressure can sometimes tempt programmers or project teams to skip testing. This frequently causes more delays than it prevents.
  • Working on too many projects at the same time – task switching leads to more problems than solutions.
  • The ‘one-solution-fits-all’ syndrome – trying to fit an existing solution to any new problem.
  • Mediocre personnel – technical or process insufficiency causes delays at multiple levels.
  • Customers fail to fulfill agreements - when customers do not react in a timely manner in areas where they must be involved, projects can come to a standstill.
  • Tension between customers and developers - if the project is not proceeding well enough, the tension can cause additional delays. It damages trust and the working atmosphere.

Another interesting reason for delays was suggested by Robert Neri, who pointed out that differing levels of Agile adoption within an enterprise might also cause delays. He mentioned:

One of the things we often encounter is that Support organizations cannot move as fast as the project sprints and tend to delay Agile projects. Similarly, non-Agile projects have a difficult time addressing the integrations with Agile projects.

Thus, if your Agile project is getting delayed, try to map the reasons to one of the common causes of delays. Once you identify the cause, it is wise to start resolving it immediately. This would reduce one of the biggest wastes in the project.

 


clock September 16, 2009 23:58 by author Sky Jia (贾超)

clock April 14, 2009 15:48 by author Sky Jia (贾超)

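By Shane Hastie, translated by 金明, published April 12, 2009, 7:32 PM

In these economically turbulent times, more and more organizations are embracing agile development as a survival strategy. This has drawn many researchers to study the attitudes and characteristics that teams within an organization need in order to succeed. Business agility (the ability to "sense change and respond to it efficiently") is very important, but how can that agility be achieved?

Picking just three themes from the mass of available material, we can see the importance of values, motivation, and extreme interviewing (which helps select the right people).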

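Values and ethics

Michele Sliger (co-author of The Software Project Manager's Bridge to Agility) argues that agile is about the ethics of doing business, and that organizations succeed by paying attention to the following eight ethical principles:

  1. Commitment: do only what is relevant to delivering business value
  2. Focus: do only the things that deliver business value
  3. Openness: honestly show the true state of the project
  4. Communication: talk to everyone, answer questions promptly, and help team members coordinate their work
  5. Simplicity: keep goals clear, deliver the most value at the least cost, and deliver value early
  6. Feedback: use stakeholder feedback to keep the team focused on the value being delivered
  7. Courage: dare to make decisions, and dare to say no when delivering value would cross a line
  8. Respect: respect everyone, including stakeholders outside the immediate team; understand the owners of the product we build and care about their needs

(Attentive readers will notice that these are the values of Extreme Programming, and that they fit well with the values and principles of the Agile Manifesto.)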

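The restaurateur's motivation

A newsletter on Enthiosys titled "Chefs and the Agile Restaurateur" compares agile development with the restaurant business and goes on to discuss why business agility is necessary. The article offers many useful comparisons from which agile teams can benefit greatly:

  • Profit is created only when customers buy and use our solution, not when we release it. A chef with a gorgeous menu is not yet a success; success comes only when people walk in and order.
  • Releasing is not the same as profit. A poorly coordinated kitchen that serves dishes in the wrong order only annoys customers; only accurate service pleases paying diners and wins repeat business.
  • Coordinated releases earn profit sooner. If everything in the kitchen flows smoothly, we can turn tables faster and earn more money sooner.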

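Picking the right people

How do you pick people with the right characteristics? CIO magazine interviewed two pioneers in this area who use "extreme interviewing" to filter candidates and find those with an agile attitude. They focus on candidates' collaboration, creative exploration, attitude toward learning, and team skills.

The interview process is demanding and may scare off some applicants, but it ensures that the people who join fit the team and have the appropriate skills, both of which are essential to the team's success. Individuals matter more than processes, and hiring the right people provides the best foundation for business success.

There is no magic formula that guarantees survival and success in these turbulent times. But more and more businesses recognize that agile attitudes and practices offer a framework in which they can place their hopes, along with tools for responding quickly to changing market demands.

See the original English article: http://www.infoq.com/news/2009/03/Achieving-Agility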


clock April 14, 2009 13:54 by author Sky Jia (贾超)

According to the Cambridge Dictionary an apprentice is "someone who works for an expert to learn a particular skill or job". Merriam Webster says: "one who is learning by practical experience under skilled workers a trade, art, or calling". Uncle Bob Martin recently wrote about his experience with apprentices and what he considers key to progressing from apprentice to journeyman.

He describes two hypothetical apprentices. Sam is a developer who has apprenticed with the same master and has had the same year fifteen years in a row. The other, Jasmine, has changed jobs (and therefore masters) a number of times, growing her skills along the way. The following diagram illustrates the difference in their progress.

Bob’s point is that Sam, who has never changed masters, will always be a student, and his growth is limited. Jasmine, whose path has been varied, really is a journeyman, travelling from master to master and learning new things from each. Eventually Jasmine herself can become a master.

One commenter, JMiller, suggests that at a large enough company, such as one the size of Google or Microsoft, you don’t need to leave your employer to change masters.

Corey Haines points out that while there are companies large enough to support journeyman tours inside the company, none that he knows of encourages it.

From her experience at Tektronix, Rebecca Wirfs-Brock remarks: “To me, moving around in the same company is roughly equivalent to changing employers, especially if the company is big enough…and I did several job shifts in my 13 years at Tektronix.”

Corey Haines is starting to have some ideas about how one transitions from apprentice to journeyman:

During the apprentice phase, a person is busy learning. They are practicing specific techniques, rigorously applying rules and procedures. Over time, having been influenced by many mentors, an apprentice starts to develop their own toolbox, the set of practices that they systematically apply. These practices form a basis for further development, a core that an apprentice can build upon.

Paul reports that companies in the UK use a similar approach when hiring and training mechanical apprentices. After 6-12 months the apprenticeship is complete, and people often move on to somewhere else in the industry. Even though the company may not retain that person, it benefits from having a larger pool of well-trained people to hire from in the future.


clock March 12, 2009 18:21 by author Sky Jia (贾超)

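By 章昱恒, published March 5, 2009, 12:30 AM

Data migration is the process, in system software development, of moving data that has real business value between different storage media, storage formats, or computer systems, according to functional requirements or the needs of system development.

Data migration is a task that system development frequently involves. In enterprise applications, building new systems, upgrading from old systems to new ones, and routine system maintenance all inevitably involve a great deal of migration work. In a business system centered on data, migration is everywhere. For example, in a system designed around a data warehouse architecture, implementing the ETL (extract, transform, load) part is a kind of data migration; in the distributed deployment of a large data system, data migration is the main part of the whole rollout. And in agile practice, evolutionary database development involves a great deal of data migration and synchronization.

We often hear users say something like: "We don't care too much about how good the application is, but the data must be accurate." Indeed, in industries that run on data, data quality is itself a heavily weighted part of software quality, and this is especially evident in telecommunications, finance, and control systems. Data migration is also an important part of agile development: it affects the data quality of every release, and data quality in turn determines the effectiveness and reliability of the system, so completing data migration to a high standard cannot be neglected.

Data migration is often regarded as simple work. In many people's eyes it is merely a matter of loading data into the relevant tables with SQL statements. In practice, however, data migration involves factors at many levels, such as user requirements, system functionality, and database modeling, and a problem in any of them leads to slow progress or poor quality. Common problems include unclear business logic, dirty data, and the technical and management debt of legacy systems. How, then, can we avoid these problems effectively and improve migration quality?

Using a CRM project that ThoughtWorks China carried out with a client as background, this article describes how to handle data migration to a high standard in agile development, and thereby improve system quality at the data level.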

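Development background

System A (the old system) is the client's existing CRM (customer relationship management) system. It uses a browser/server (B/S) architecture with SQL Server 2005 as the back-end database. The old system's data model was highly normalized in pursuit of extreme flexibility. Business data is split up and scattered across hundreds of tables. There are no referential constraints within or between tables. A large amount of business logic is encapsulated in stored procedures for efficiency. The body of stored procedures is enormous, with complex mutual calls. The database contains some dirty data, possibly caused by long-term use, maintenance, or operator error, but nobody knows how much there is or exactly where it is. The usability of the application's interface is poor and the system is slow; users often complain that it responds sluggishly or not at all. The database stores roughly 50 GB of business data.

The ThoughtWorks team will deliver a new CRM system to replace the main functions of the old one. The new system streamlines the old system's functionality and incorporates the client's latest requirements. The design changed dramatically to improve the usability of the interface. At the same time, to guarantee service to end users, the old and new systems must be able to run side by side with their data synchronized; once all end users have moved to the new system, the old system is shut down. Throughout this process, the DBA team must provide adequate data assurance.

The following figure shows the project's release plan.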

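Data migration development approach

1. The DBA needs to set goals and manage their own tasks

Although the team discusses and decides how to organize the user stories in each iteration, the DBA still needs a story wall of their own and needs to spend time organizing their own stories. In practice, data migration is only part of the DBA's work: the DBA also has to complete the corresponding story development and data analysis, and sometimes provide data support to developers. Disorganized management leads to conflicts in development. Managing tasks effectively is therefore the first step toward doing data migration well.

A story wall is the best way to manage these tasks. Although the business value it provides to the client is indirect, from the team's perspective any person or program that needs data is a user of the DBA. The story wall helps manage the data requirements contained in each story and avoids conflicts between data migration tasks and other database development tasks, which reduces duplicated work and rework. DBAs should bring this practice into database development.

The DBA should decide on data migration requirements from the perspective of business value. During development, clients and developers often bring their own data migration requests to the DBA, but these requests are usually neither global nor decisive, since each is raised for the needs of a single story. If the DBA carries them out blindly, the result is twice the work for half the benefit. The DBA should actively attend the IPM (the iteration planning meeting, held at the start of each iteration, where the whole team discusses how many stories it plans to complete). Whether interacting with users directly or working with the team, the DBA needs to understand the content of every story thoroughly. The DBA usually does not need to know a story's development details the way developers do, but through conversations with business analysts and developers, latent data requirements surface naturally. By reorganizing and prioritizing these data requirements, we can easily answer questions such as: What task should be completed next? What is its actual business value? Who needs it? When is it needed? Practice shows that spending more time communicating with the team or the client pays off many times over. A DBA who understands the business data can also guide developers better and reduce their misunderstandings of the data, which helps raise the whole team's productivity.

Having understood each story, we summarized and defined the 7 data migration stories needed for the current release, confirmed that they did not overlap, and invited the project manager and the client to confirm the plan with us. With that, our goals were set.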

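2. Think through the implementation strategy

With all the data migration tasks under management, the next question is how to implement them. Past experience has shown us that writing code directly, without thinking carefully about the big picture and the details, leads to consequences that cannot be controlled. We should first fully understand the risks the process may carry, then decide what strategy to adopt and whether tools can improve efficiency. The potential risks here mainly include:

2.1 Data quality

The old system's database model is highly normalized, and the tables depend on one another heavily. Once one table contains dirty data, how do we ensure that queries return correct results?

2.2 Understanding of the existing system

The old system's application uses object-oriented design, and inheritance relationships are also stored across several tables. How do we distinguish these business objects and relationships correctly and make sure the migration does not create dirty data?

2.3 Business data mapping

There are considerable differences in business logic between the old and new systems. Can we map the business logic and data onto the new system? Are there transformations that cannot be done?

Until these questions are adequately understood, we cannot plan further; giving the client prompt feedback is the best way to resolve them. After further discussion we found the problems far more complex than we had imagined: although the client knew the old system very well, they could not give definite answers about some of the data. In view of this, we drew up an initial strategy:

  1. Learn more about the old system and give feedback promptly. For questions that cannot be answered, consider whether other resources can be consulted or whether data without value can be ignored.
  2. Break every complex requirement down as finely as possible into multiple tasks. Fine-grained tasks help expose more problems.
  3. Work test-first and make sure a reliable testing mechanism is in place.
  4. Define an implementation framework and milestone goals.
  5. Do not be overly optimistic when estimating progress; leave ample room for unit tests at every stage.
  6. Adjust the content of each iteration; tasks with strong dependencies can be deferred to later iterations.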

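3. Carry out the data migration

Data migration for the new system has two parts: one-time data migration and synchronization data migration.

One-time data migration

One-time data migration happens only when a particular release goes live: with both the old and new systems offline, part of the data is moved from the old system to the new one.

Synchronization data migration

Synchronization data migration happens while the new system is running: both systems are in operation and exchange data to keep each other consistent.

Both are data migration, but because each type has its own characteristics, they are handled somewhat differently even where the approach is shared.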

  

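Characteristics

One-time data migration:

  1. Large data volume.
  2. Used rarely (run once).
  3. Complex transformation logic that needs a large amount of custom data mapping and transformation.

Synchronization data migration:

  1. Small data volume.
  2. Used frequently (runs periodically, at intervals measured in minutes).
  3. Complex transformation logic with a small amount of custom data mapping and transformation.
  4. Requires transactions to guarantee data consistency.

Practices shared by both

  1. Break down tasks.
  2. Test-driven development.
  3. Continuous integration.

Practices that differ

One-time data migration:

  1. In test-driven work, focus the tests on data quality, and strengthen the test suite based on test results from the different environments.
  2. Tool selection: avoid third-party tools and use SQL scripts directly to make the migration more efficient.
  3. Keep intermediate results.

Synchronization data migration:

  1. In test-driven work, focus the tests on logical mapping.
  2. Tool selection: consider third-party tools to strengthen transaction control.
  3. Intermediate results need not be kept.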
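  1. Break down tasks

Following the development strategy we set at the start, when we meet a complex migration requirement we first decompose it into several modules and then draw the overall structure diagram. Below is the module breakdown for one of the data migration scripts:

At first, this part of the migration logic was so complex that the client had no confidence in its results. But once we had completed this diagram together, everyone agreed it was not as hard as imagined. In short, solving a complex problem all at once is hard, but solving one small piece of it is easy.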

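  2. Test-driven development

Just as when writing application code, we not only used test-driven development for the data migration scripts but also brought in some methods from database design. In programming, when the code is well structured and the relationships between units (classes, methods) are clear, unit tests can be added directly. Now that we had well-structured script logic, it was easy to add unit tests for the result of each step. This forms a safety net that ensures abnormal data is detected and handled as soon as it appears. Before actually writing a migration script, first clarify what is to be tested and prepare the test scripts.

The tests cover:

  • Expected data that should be produced

Given the original test data, this test checks whether the script's data transformation logic is correct. For example:

Test environment: the old system contains information for a customer named 'Jason', whose personId is 1000101.

Test purpose: after a customer's information has been migrated to the CUSTOMERS table of the new system, that customer's information should exist in the new system.

Test code to run on the new system: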

DECLARE @personName NVARCHAR(250)

SELECT @personName = personName
FROM CUSTOMERS
WHERE personId = 1000101

-- Log an error when the customer is missing or the name differs
IF (@personName <> 'Jason') OR (@personName IS NULL)
BEGIN
    INSERT INTO LoadTestErrorLog (errorDescription)
    VALUES ('personName for personId 1000101 is not Jason')
END
GO

 

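The usual rule here is that one piece of SQL tests exactly one expected value. This reduces dependencies between pieces of code and pinpoints bad data more precisely.

  • Abnormal data that should not be produced

Abnormal data is data produced during the migration that does not make logical sense. In theory a migration should not produce abnormal data, but in reality the results always contain data we do not want. The causes include anomalies in the data source, mistakes during implementation, and bugs in the application. In short, to make sure these errors never reach the final result, the corresponding test scripts are indispensable, and they are also an effective way to keep problems from spreading. This kind of test is often used to find problems that may occur in the production environment. The following example shows how to test for abnormal data:

Test environment: all or part of the production data.

Test purpose: after customer information has been migrated to the CUSTOMERS table of the new system, the table should contain no records with an empty customer name; any such record is treated as a migration error.

Test code to run on the new system: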

DECLARE @isExistPersonNameWithNULL INTEGER

SELECT @isExistPersonNameWithNULL = count(*)
FROM CUSTOMERS
WHERE personName IS NULL

-- Log an error when any customer has no name
IF (@isExistPersonNameWithNULL > 0)
BEGIN
    INSERT INTO LoadTestErrorLog (errorDescription)
    VALUES ('personName doesn''t contain legal information')
END
GO

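  • Whether the row counts of the tables are as expected

After the data has been migrated to the new system, make sure the amount of migrated data matches the expected value. There are many ways to do this; a simple one is to compare directly whether the record counts before and after the migration are equal. For example:

Test environment: all or part of the production data.

Test purpose: after customer data has been migrated, make sure no customer data has been lost.

Test code to run on the new system: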

DECLARE @NumberofCustomerinOldDB INTEGER
DECLARE @NumberofCustomerinNewDB INTEGER

SELECT @NumberofCustomerinOldDB = count(*)
FROM oldDB.dbo.persons -- the customer table defined in the old system
...
-- complex filtering logic omitted

SELECT @NumberofCustomerinNewDB = count(*)
FROM newdb.dbo.CUSTOMERS -- the customer table defined in the new system

-- Log an error when the two counts differ
IF (@NumberofCustomerinOldDB <> @NumberofCustomerinNewDB)
BEGIN
    INSERT INTO LoadTestErrorLog (errorDescription)
    VALUES ('not all customers are migrated')
END
GO
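The three kinds of checks described above (expected data, abnormal data, row counts) can be tried out without a SQL Server instance. The following is a minimal sketch in Python that uses the standard-library sqlite3 module as a stand-in for the real databases. Only the table names CUSTOMERS and LoadTestErrorLog and the error messages come from the article; the column types, the sample rows, and the helper names run_checks and make_db are assumptions made for illustration.

```python
import sqlite3


def make_db(rows):
    """Build an in-memory stand-in for the new system (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE CUSTOMERS (personId INTEGER PRIMARY KEY, personName TEXT)")
    conn.execute("CREATE TABLE LoadTestErrorLog (errorDescription TEXT)")
    conn.executemany("INSERT INTO CUSTOMERS VALUES (?, ?)", rows)
    return conn


def run_checks(conn, expected_count):
    """Run the three checks and return everything logged to LoadTestErrorLog."""
    cur = conn.cursor()

    def log(message):
        cur.execute("INSERT INTO LoadTestErrorLog (errorDescription) VALUES (?)", (message,))

    # 1. Expected data: one query checks exactly one expected value.
    row = cur.execute(
        "SELECT personName FROM CUSTOMERS WHERE personId = ?", (1000101,)
    ).fetchone()
    if row is None or row[0] != "Jason":
        log("personName for personId 1000101 is not Jason")

    # 2. Abnormal data: no customer may have a NULL name.
    (null_names,) = cur.execute(
        "SELECT COUNT(*) FROM CUSTOMERS WHERE personName IS NULL"
    ).fetchone()
    if null_names > 0:
        log("personName doesn't contain legal information")

    # 3. Row count: the migrated count must match the count from the old system.
    (migrated,) = cur.execute("SELECT COUNT(*) FROM CUSTOMERS").fetchone()
    if migrated != expected_count:
        log("not all customers are migrated")

    logged = cur.execute("SELECT errorDescription FROM LoadTestErrorLog ORDER BY rowid")
    return [r[0] for r in logged]


good = run_checks(make_db([(1000101, "Jason"), (1000102, "Amy")]), expected_count=2)
bad = run_checks(make_db([(1000101, None)]), expected_count=2)
print(good)
print(bad)
```

Passing the query parameters separately, as in the first check, also sidesteps the quoting problems that string literals such as "doesn't" cause when they are pasted directly into SQL text.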

 

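Finally, after assembling the SQL test snippets we had a set of test scripts, which we automated with NAnt following the flow below:

Implementation in NAnt:

<target name="-init" … />

This target initializes the test environment.

<target name="-parseDbScripts" … />

This target compiles and deploys the migration scripts.

<target name="-resetTestData" … />

This target resets the test data.

<target name="-executeMigrationScripts" … />

This target runs the migration scripts.

<target name="-testMigration" … />

This target runs the migration test scripts.

<target name="testDataMigration" depends="-init, -parseDbScripts, -resetTestData, -executeMigrationScripts, -testMigration" />

This target is the entry point that continuous integration calls.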

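  3. Continuous integration

A test sandbox is essential for continuous integration testing. A "sandbox" is a complete, functioning environment in which scripts can be compiled, tested, and run.

  • In the development sandbox we prepared a small amount of core data to test the quality of the SQL scripts.
  • In the system-level integration test sandbox we also prepared a small database containing part of the core data, focused on testing the logical transformations of the migration process.
  • In the production-level test sandbox the database comes from backups of real data, so the data keeps changing, which makes it all the more necessary to run the test scripts continually to guard against dirty data and data loss. Because the production data volume is much larger, we can reduce the number of test runs appropriately to save development resources. Other test scripts, such as those that change the database structure, can be grouped with the data migration scripts and tested in one pass.

    Likewise, we maintain these development and test sandboxes through automation.

    The tests are placed in the continuous integration environment; the figure below shows the test tasks in that environment.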

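  4. Tool selection

A data migration tool should be chosen for how much it improves both working efficiency and the run-time efficiency of the migration. The most direct approach is usually to write SQL scripts; other tools, such as MS SSIS, can also work well. However, we found that bringing in too many third-party tools tends to bring trouble as well: we have to spend time learning particular usages of the tools, and sometimes the tools themselves have bugs that take yet more time to work around, which runs against the original development goal. An effective approach is therefore to do all of the migration work with SQL scripts as far as possible, which also gave us the best execution efficiency.

  5. Keep intermediate results for script debugging

Compared with programming languages, SQL statements are hard to debug. Even though some database products provide debugging tools, debugging result sets remains challenging. In a migration from an old system to a new one in particular, the business logic changes enormously, and clients often ask for evidence to address their doubts about the migration results. Keeping the data from intermediate steps makes debugging easier and makes data traceable, which makes development more efficient. For example: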

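SELECT
...
INTO debug_allpersonhistroy -- keep the raw extract as an intermediate result
FROM oldDB.dbo.personhistory -- the business table defined in the old system
...
-- complex filtering logic omitted

SELECT column1...columnN
INTO debug_allpersonhistroy_aftermapping -- keep the result set of this step
FROM debug_allpersonhistroy INNER JOIN mappingtableBtwOldandNew
...
-- complex filtering logic omitted

SELECT
...
FROM newdb.dbo.contactHistory -- the business table defined in the new system
...
-- complex filtering logic omitted
GO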
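To see why kept intermediate results help with tracing, here is a small runnable sketch in Python with sqlite3 standing in for SQL Server (SQLite's CREATE TABLE ... AS SELECT plays the role of SELECT ... INTO). The columns, the status_mapping table, the debug_step* table names, and the sample rows are all hypothetical; only the idea of materializing each step comes from the article.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical stand-ins for the old table, the mapping table, and the new table.
    CREATE TABLE personhistory (personId INTEGER, oldStatus TEXT);
    CREATE TABLE status_mapping (oldStatus TEXT, newStatus TEXT);
    CREATE TABLE contactHistory (personId INTEGER, status TEXT);
    INSERT INTO personhistory VALUES (1, 'A'), (2, 'B'), (3, 'X');
    INSERT INTO status_mapping VALUES ('A', 'active'), ('B', 'blocked');

    -- Step 1: keep the extracted source rows.
    CREATE TABLE debug_step1_source AS
        SELECT personId, oldStatus FROM personhistory;

    -- Step 2: keep the rows that survived the mapping.
    CREATE TABLE debug_step2_mapped AS
        SELECT s.personId, m.newStatus
        FROM debug_step1_source AS s
        INNER JOIN status_mapping AS m ON m.oldStatus = s.oldStatus;

    -- Step 3: load the target table from the last intermediate result.
    INSERT INTO contactHistory (personId, status)
        SELECT personId, newStatus FROM debug_step2_mapped;
""")

# Rows present after step 1 but absent after step 2 were dropped by the mapping,
# which is exactly the evidence a client asks for when a record "disappears".
dropped = conn.execute("""
    SELECT personId, oldStatus FROM debug_step1_source
    WHERE personId NOT IN (SELECT personId FROM debug_step2_mapped)
""").fetchall()
loaded = conn.execute("SELECT COUNT(*) FROM contactHistory").fetchone()[0]
print(dropped, loaded)
```

Comparing two adjacent intermediate tables narrows a discrepancy down to a single step, at the cost of extra storage for each kept result set.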

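Typical problems

Data migration tends to run into different problems in different scenarios, and experience alone cannot solve them all. Use brainstorming to bring the whole team together to think through every problem that might arise and avoid it. Sometimes the problems developers have already hit save the DBA from detours. In the end, what brainstorming gives us is a valuable list of problems and points to watch:

  1. Consistency checks

Consistency checks cover character encoding, language settings, environment parameter settings, and so on.

What most often goes wrong during migration is the character set, and the result is garbled data. Different systems do not necessarily use the same character set or encoding in their original design. During migration, relying on default settings alone is not safe enough. An effective approach is to confirm environment consistency between the systems at the very start of the project. Using a compatible Unicode encoding in the new system can also solve these problems.

  2. Control the use of NULL

Because the old system itself rarely uses constraints, table joins produce a large amount of data that cannot be matched correctly. In SQL, when we tried natural joins we found that some data went missing; using outer joins instead introduces a new kind of dirty data: NULL. From a database design standpoint NULL carries no meaning, but in practice many data models assign NULL a meaning, or even several, so that different queries have to treat it according to different business logic. This is everywhere in the old system and undoubtedly made migration much harder.

Solution: do not assign logical meaning to NULL, and use outer joins as little as possible.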

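For example:

The old system defines the following parent-child table:

objectId   parentObjectId   objectType   …
------------------------------------------
Null       Null             'root'
1          Null             'contactManager'
2          1                'contact'
3          1                'contact'
4          Null             'orgnisation1'

Clearly, the system intends to build the following object tree:

However, when the program tries to traverse all the objects, it finds that NULL cannot take part in the computation, because the result of any computation involving NULL is NULL. The program has to add extra code to handle the special cases.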
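The behaviour described for the parent-child table can be reproduced in a few lines. This is a sketch in Python with sqlite3 standing in for SQL Server; the table name objects and the column types are assumptions, while the rows are the ones listed above. SQLite follows the standard three-valued logic for comparisons with NULL, which is also what SQL Server does under its default ANSI_NULLS setting.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE objects (objectId INTEGER, parentObjectId INTEGER, objectType TEXT);
    INSERT INTO objects VALUES
        (NULL, NULL, 'root'),
        (1,    NULL, 'contactManager'),
        (2,    1,    'contact'),
        (3,    1,    'contact'),
        (4,    NULL, 'orgnisation1');
""")

# Comparing NULL with NULL yields NULL (None in Python), not true.
null_equals_null = conn.execute("SELECT NULL = NULL").fetchone()[0]

# An inner join on the parent link silently drops every row whose parent is NULL,
# so the children of 'root' never show up.
joined = conn.execute("""
    SELECT c.objectId, p.objectType
    FROM objects AS c
    INNER JOIN objects AS p ON c.parentObjectId = p.objectId
    ORDER BY c.objectId
""").fetchall()

# The "extra code for the special case": top-level objects need IS NULL.
top_level = conn.execute("""
    SELECT objectType FROM objects
    WHERE parentObjectId IS NULL AND objectId IS NOT NULL
    ORDER BY objectId
""").fetchall()

print(null_equals_null, joined, top_level)
```

Giving the root a real identifier instead of NULL would let a single join handle every level of the tree, which is the point of the advice not to assign meaning to NULL.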

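  3. Reuse code and reduce dependencies

Migration scripts should follow the same rules as application code: high cohesion and low coupling. Code that can be reused should be encapsulated into units wherever possible; copy-and-paste is not a method that migration scripts should use.

Solution: use temporary stored procedures to reuse common code and simplify the calling interface.

  4. New problem, new test

When we hit a new problem, we are usually busy solving it and explaining it. But once that is done, it does not mean the problem is fully resolved, because it may well happen again, and it also shows that current testing is insufficient.

Solution: when a new problem appears, pause the current work and immediately write a test for it. Spending a little time on this keeps technical debt from piling up.

For example: in the new system's database, QA found a set of illogical data in which a record's end time (EndTimestamp) was 8 hours earlier than its start time (startTimestamp). The expected result is that a record's end time must be later than its start time.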

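ID         startTimestamp        EndTimestamp          createDate            …
---------------------------------------------------------------------------------
11020011   2008-12-14 09:23:00   2008-12-14 01:23:00   2008-12-14 09:23:00

Clearly the program used the wrong time zone when inserting the data. Before the bug is fixed, immediately add a database test to make sure it cannot recur.

The test code is as follows: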

DECLARE @CNT INTEGER

-- Count the records whose start time is after their end time
SELECT @CNT = COUNT(*) FROM tableA WHERE startTimestamp > EndTimestamp

IF @CNT > 0
BEGIN
    INSERT INTO LoadTestErrorLog (errorDescription)
    VALUES ('EndTimestamp should be later than startTimestamp')
END
GO
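The 8-hour gap in the QA example is what one would see if one column were written as local wall-clock time and the other as UTC. The sketch below, in Python, reproduces that effect under the assumption that the server runs at UTC+8; the article does not say which time zones were involved, and the helper names wall_clock and violates are hypothetical.

```python
from datetime import datetime, timedelta, timezone

LOCAL = timezone(timedelta(hours=8))  # assumption: the application server is at UTC+8


def wall_clock(instant, tz):
    """Return the naive wall-clock value that would be stored in a timestamp column."""
    return instant.astimezone(tz).replace(tzinfo=None)


def violates(start, end):
    """Same condition as the database test: the start is after the end."""
    return start > end


# One and the same instant, written two different ways.
instant = datetime(2008, 12, 14, 1, 23, tzinfo=timezone.utc)
start_stored = wall_clock(instant, LOCAL)               # stored as local time
end_stored_buggy = wall_clock(instant, timezone.utc)    # stored as UTC by mistake
end_stored_fixed = wall_clock(instant, LOCAL)           # stored consistently

gap = start_stored - end_stored_buggy
print(start_stored, end_stored_buggy, gap)
print(violates(start_stored, end_stored_buggy), violates(start_stored, end_stored_fixed))
```

Because the stored values are naive, nothing in the database can tell the two representations apart afterwards, which is why a check on the relationship between the columns is the practical safeguard.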

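  5. The mindset that goal setters and developers should keep

Data migration looks simple but is challenging, so we are often too optimistic when estimating development effort. The risk is that we see only the processing logic and not the data quality, so that a migration script written blindly works in the test environment but cannot run in production.

Solution: however simple the data migration seems, first talk through the business logic with the client or the business analyst to make sure the data quality is understood.

Conclusion

Data migration is work that looks simple but holds great challenges. It involves not only specific technical problems but also requires the DBA to communicate well and understand the business logic deeply. Through the migration from the old system to the new one, we gradually carried lean software design thinking down into the details, with very good results. By the time the migration was finished we had written nearly 6,000 lines of migration scripts, the results passed the client's sampling tests, and the system as a whole ran normally.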


clock March 12, 2009 13:26 by author ethan

Team Foundation Server (commonly abbreviated TFS) is a Microsoft offering for source control, data collection, reporting, and project tracking, and is intended for collaborative software development projects. It is available either as stand-alone software, or as the server side back end platform for Visual Studio Team System (VSTS).


Architecture


Team Foundation Server 3-tier architecture

Team Foundation Server works in a three-tier architecture: the client tier, the application tier and the data tier. The client tier is used for creating and managing projects and accessing the items that are stored and managed for a project. TFS does not include any user interface for this tier, rather it exposes web services which client applications can use to integrate TFS functionality with themselves. These web services are used by applications like Visual Studio Team System to use TFS as data storage back end or dedicated TFS management applications like the included Team Foundation Client. The web services are in the application layer. The application layer also includes a web portal and a document repository facilitated by Windows SharePoint Services. The web portal, called the Team Project Portal, acts as the central point of communication for projects managed by TFS. The document repository is used for both project items and the revisions tracked, as well as for aggregated data and generated reports. The data layer, essentially a SQL Server 2005 Standard Edition installation, provides the persistent data storage services for the document repository. The data tier and application tier can exist on different physical or virtual servers as well, provided they are running Windows Server 2003 or better. The data tier is not exposed to the client tier, only the application tier is.

Most activity in Team Foundation Server revolves around a "work item". Work items are a single unit of work which needs to be completed. In many respects they are similar to a "bug" item in bug tracking systems such as Bugzilla, in that a work item has fields to define Area, Iteration, Assignee, Reported By, a history, file attachments, and any number of other attributes. Work items themselves can be of several different types, such as a Bug, a Task, a Quality of Service Assessment, a Scenario, and so forth. The framework chosen for any given project in a Team Foundation Server defines what types of work items are available and what attributes each type of work item contains. These items are internally stored in XML format, and their schema can be customized to add other attributes to different items, or create new items on a per-project basis. Each work item has associated control policies which control who is allowed to access and/or change the items. It also includes notification and logging capabilities to log all the creation, access or change events (controlled by policies) and optionally notify certain users when certain events occur.

Any given Team Foundation Server contains one or more Team Projects, which consists of Visual Studio solutions, configuration files for Team Build and Team Load Test Agents, and a single SharePoint repository containing the pertinent documents for the project. A team project contains the user defined work items, source branches, and reports that are to be managed by TFS. TFS provides capabilities for managing these projects. When creating a project, a software development framework must be chosen, and cannot be changed afterwards. TFS includes several templates for the most common ones, including agile and formal methodologies. Choosing the framework populates the project with predefined items such as project roles and permissions, as well as other documents like project roadmap, document templates, and report definitions. These items can be then linked to work items as well. The status of certain elements of the project can be set to automatically update as work items are updated. TFS can integrate with Microsoft Excel for the creation and tracking of project items. The status of the items can be created and edited in Excel and the resulting spreadsheet document can be submitted to TFS, which will import the data into its project management feature. It can also integrate with Microsoft Project as the project management front end. The project items can also be exported as Excel documents for further analysis of the data.

TFS does not natively include a UI for performing these tasks. The capabilities are exposed via web services, which are then used by client applications like Visual Studio Team System IDE. However, TFS does include a Team Foundation Client (TFC) application which can be used to perform these tasks outside of the VSTS IDE. TFC also operates by invoking the same web services. TFS exposes a client API that can be used by client applications to access the functionality; the API itself manages proxies to communicate with the web services as well as client side caching to reduce latency. The WSDL descriptions of the web services are also provided, in case an application wants to directly call the web services. Visual Studio Team System Web Access, available as an add-on, also addresses this.

Source control

Team Foundation Server provides a source control repository, called Team Foundation Version Control (TFVC). Unlike Microsoft's previous source control offering, Visual SourceSafe (VSS), which relied on a file-based storage mechanism, Team Foundation source control stores all code, as well as a record of all changes and current check-outs in a SQL Server database. It supports features such as multiple simultaneous check-outs, conflict resolution, shelving and unshelving (shelving is a way to save a set of pending changes without committing them to source control, while still making them available to other users), branching and merging, and the ability to set security levels on any level of a source tree, alongside the most visible features of document versioning, locking, rollback, and atomic commits. The source control mechanism integrates with Team System's work items as well; when a check-in (termed "changeset") occurs, a developer can choose to have his code associated with one or more specific work items, to indicate that the check-in works towards solving specific issues. TFS administrators can enforce check-in policies that require Code Analysis requirements to have passed, as well as to enforce the association of check-ins with work items, or update the state of associated work items (like flagging a bug as "fixed" when checking in code that has the bug fixed). Individual versions of files can be assigned labels, and all files with the same label forms a release group. Unlike VSS, TFS source control repository does not support linking to an item from multiple places in the source folder structure, nor does it allow an item to be "pinned" (allow different references to the same file from different directories to point to different versions in a way that cannot be further edited).

TFVC supports branching at entire source code level as well as individual files and directory levels as well, with each branch being maintained individually. Multiple branches can be merged together, with the built in conflict resolution algorithm merging the changes between two branches of the same file where it can automatically reconcile the differences or flagging them for manual inspection if it cannot. Merge can be performed at "changeset" level as well, instead of the branch level. A successful merge is automatically checked out in the source control repository.

TFVC is not limited to source code only, but using the Windows SharePoint Services infrastructure it is built on, it provides a version-controlled library for other documents in the project as well, including project plans, requirements and feature analysis documents among others. All documents in the source controlled repository can be linked with any work item, and access to them can be controlled by defining access policies.

Reporting

Reporting is another major component of Team Foundation Server. Using the combined data for work items, changesets, and information provided by Team Build and results from Test Agents, a variety of reports can be created. For example, the rate of code change over time, lists of bugs that don't have test cases, regressions on previously passing tests, and so on. The reports are built using SQL Server Reporting Services, and can be exported in several different formats, including Excel, XML, PDF, and TIFF. Reports can be accessed both through Visual Studio, as well as through the web portal.

TFS uses its logging framework for automated data collection as well. The logging infrastructure monitors and logs information regarding access and use of the work items and source code, which can then be used by the analysis services to find trends. TFS includes a warehouse adapter in the data tier, which caches data from the underlying normalized database in a form suitable for analytics - in fact tables and dimension tables. SQL Server Analysis Services are then used to analyze this data, and reports created. Reports can span multiple work items including bug trends, code churning, build trends amongst others. Other analysis applications can also use the data directly pulled off the web services.

Project portal

On a per-project basis, TFS also creates a SharePoint site for the project, which can be used to track the progress of the project as well as to explore the work items and source controlled documents in the project, which are presented via the document library. It can also be used to view the reports generated. As a communication medium, users associated with the project can use it to communicate with one another. The comments can be linked to various items as well. For each project, depending on the project properties, TFS uses a predefined template that defines the appearance of the site. These templates can be customized by the TFS administrators.

Shared services

TFS provides a handful of services that can be used for integration with other applications such as IDEs and project management systems. The linking service allows loosely coupled relationships to be created between items, for example between a bug item and the source code revision(s) it applies to. The security service allows the creation of security groups from users, to which access rights are then assigned. The classification service allows the definition of policies that automatically classify items based on a multitude of criteria, and the eventing service allows any component to raise an event and a notification action to be assigned to the event. The notification can be delivered by feed syndication or e-mail, or by invoking another web service.

Team build

Team Build is a build server included with Team Foundation Server that can be installed on almost any machine that can support Visual Studio. Machines configured with Team Build can be used by developers to do a complete build of the most recent versions of the software contained in source control. Records of every build, whether it succeeds or fails, are kept so that developers and build administrators can keep track of the progress of the project. If a build succeeds, it analyses what changes have been made in source control since the last successful build, and updates any work items to indicate that progress has been made. For example, if a tester filed a bug work item against build #15, and a developer checked in a change just prior to build #18 being created, then the bug work item would be updated to state that the bug has been fixed. A tester can then confirm or deny that the bug has been resolved.

Currently there are two versions of TeamBuild, each version matched to a TFS installation version. It is also highly customizable.

TFSBuild.proj is the file which drives a TeamBuild. The Team Build Language is synonymous with the msbuild language.



clock January 20, 2009 09:46 by author Sky Jia (贾超)

Traditionally, a software release is considered to be a handshake between engineering and business. Engineering passes the tested code on to business, which in turn promotes it to the market, thereby completing the cycle. With Agile, however, software releases can be split into two categories: internal and external releases. This helps create a loose coupling between the two. Internal releases are made by engineering, and business has the option of using one of them as an external release.

In a recent article on the Cutter Consortium (download code RELEASEMYTH), Israel Gat of BMC makes an interesting argument for separating the “two” releases in the software world. According to him, the internal and external releases should be viewed as two faces of the same coin:

A body of code that delivers certain features and functionalities is one thing. The use of this body of code by marketing and sales to accomplish business results is quite another. Not only do the two activities differ, but they do not necessarily need to be tied together through a 1-to-1 relationship.

He offered the metaphor of a water pool with two pipes, one for inlet and the other for outlet, comparing engineering to the inlet pipe and business to the outlet pipe.

Think of the in-pipe in this example as engineering and the out-pipe as the business. Engineering can post releases at its own pace. The business can selectively choose from the posted releases. In this paradigm, marketing is not obligated to promote a release upon its completion. Marketing might do so in three months; it might choose to promote the current release with another release due at a later time; it might choose to make a release available on a limited basis; or it might choose never to promote a release.

Israel mentioned that since engineering is now loosely coupled with business, they can move towards a fluid release concept in which the software becomes alive and continuous. Engineering can churn out internal releases at a pace suitable to them and business can make a decision on which release gets to the customer as an external release and when.

Commenting on the article, Ryan added that Israel’s team ran three internal releases for every external release. He suggested that the benefit is valuable feedback, which lets the business market the external release better. According to Ryan:

It worked great! As a result, I coach most agile teams to start by making sure their "internal release" cadence is twice as fast at marketing, operations and the market is used to. In this way you get a release where you can gain feedback and steer the "external release" to market better.

According to Israel, frequent and faster internal releases under Agile make the software more alive and fluid, which renders the traditional release process obsolete. Separating the two kinds of release lets engineering and the business each work to their own release pattern without disturbing the other's release frequency.


clock January 4, 2009 14:04 by author Sky Jia (贾超)

Project Time Management is one of the nine knowledge areas of the Project Management Body of Knowledge (PMBOK). It deals with the definition of activities (what are we going to do), the sequencing of the activities (in what order are we going to do them), and the development and control of the schedule (when are we going to perform those activities).

Agile Time Management
Over the past couple of weeks I have been trying to find out what the main principles of time management are in the case of agile software development. I was able to distinguish 10 principles so far, and I will present them here for your convenience. With each principle I also include a reference to an online article that (as far as I can tell) nicely describes the ideas behind it. If you don't agree with my list, or if you know some better reference material, feel free to add your thoughts!

1. Use a Definition of "Done"
How? Define what "Done" means and only count the activities that are Done.
Why? Prevent the build-up of hidden tasks ("technical debt") that cost a lot of time to fix down the road.
See: The Definition of "Done"

2. Use Timeboxes to Manage Work
How? Set a start and end date for a collection of activities, and don't allow changes to those dates.
Why? Timeboxes keep people focused on what's most important. Don't lose time to perfectionism.
See: Time Boxing is an Effective Getting Things Done Strategy

3. Don't Add Slack to Task Estimates
How? Don't use scheduling and buffering of tasks. Add one buffer to the end of the timebox/project.
Why? All safety margins for tasks will be used ("Parkinson's Law" and "Student Syndrome").
See: Critical Chain Scheduling and Buffer Management

4. Defer Decisions
How? Make decisions only at the latest responsible time. "No Decision" is also a decision.
Why? The environment may change, making earlier decisions a waste of time.
See: Real Options Underlie Agile Practices

5. Reduce Cycle Time
How? Iterative cycles should be as short as possible.
Why? Speed up the learning feedback loop, and decrease the time-to-market.
See: Lean Software Development: Why reduce cycle-time?

6. Keep the Pipeline Short and Thin
How? Limit the amount of work-in-progress, and the number of people working in sequence.
Why? Improve response times, speed up throughput.
See: Managing the Pipeline

7. Keep the Discipline
How? Prevent expensive rework by doing some processes well, right from the start.
Why? Solving problems late in a project is more expensive than following proper rules early.
See: The Power of Process

8. Limit Task Switching
How? Prevent unnecessary task switching between projects, and prevent interruptions.
Why? Tasks get completed faster on average, and the human brain is bad at task switching.
See: Human Task Switches Considered Harmful

9. Prevent Sustained Overtime
How? Rule out (sustained) overtime as a way to accelerate progress.
Why? Lost productivity, poor quality and bad motivation among team members.
See: The Case Against Overtime

10. Separate Urgency from Importance
How? Urgent tasks and important tasks should not be done at the same time.
Why? Otherwise the important work will usually not get done, costing you more time in the long run.
See: A 10 Second Guide to Smoother Projects: Urgent vs. Important
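
Two of the principles above lend themselves to a quick calculation: counting only Done work (principle 1) and limiting work-in-progress (principle 6). The sketch below is illustrative only. The function names and all the numbers are made up, and the second function uses Little's Law (average cycle time = average work-in-progress / average throughput), which the list does not name and which only holds on average for a reasonably stable flow of work.

```python
def velocity(stories):
    """Sum the points of stories that meet the Definition of Done.

    `stories` is a list of (points, is_done) pairs; unfinished work counts as zero.
    """
    return sum(points for points, is_done in stories if is_done)


def average_cycle_time(wip, throughput):
    """Little's Law: average cycle time = average WIP / average throughput."""
    if throughput <= 0:
        raise ValueError("throughput must be positive")
    return wip / throughput


# Hypothetical sprint: the 8-point story is "almost finished", so it is not counted.
sprint = [(5, True), (3, True), (8, False)]
print(velocity(sprint))

# Hypothetical team finishing 4 items per week: halving WIP halves the average wait.
print(average_cycle_time(wip=12, throughput=4))  # weeks per item with 12 items in progress
print(average_cycle_time(wip=6, throughput=4))   # weeks per item with 6 items in progress
```

Under these assumptions, cutting work-in-progress from 12 items to 6 at the same throughput halves the average time an item spends in the pipeline, which is the effect principle 6 is after.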


clock December 3, 2008 19:06 by author Sky Jia (贾超)

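 1. Who is the user?

The user should not be defined only as the end user. When you develop a product, the users include end users, developers, managers, bosses (investors), partners, business customers and so on: every group you have to deal with in order to define, develop, implement and operate the product. The "design" in product design is connected and evolving by nature; in other words, it covers both the design itself and the way the design is conveyed and communicated. So when you want to practice "user-centered design", you need to consider more users, including the colleague sitting next to you. Everyone involved, other than yourself, is a user.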

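2. Who is the design itself for?
Coming back to product design itself, we need to understand who the product faces most directly and who it is ultimately designed for. Aim squarely at the end user. This is the main thread of the product, and also the main thread of conveying and communicating the design, so keep that aim from start to finish.

First of all, then, avoid designing for yourself. This happens very often, even if you won't admit it: users are always on your mind and you understand that you should design for the end user, yet the result usually falls short. For a designer or developer, it is genuinely hard to set aside the cognitive and behavioral patterns already in your own head and adopt those of another person (the user). You can better avoid the problem by building a new memory channel, so that from the very source it is no longer you. That's right: this means creating and refining user personas, which goes some way toward solving the mismatch. The fuller this memory channel is, the less likely a mismatch becomes and the more accurate the aim of the design.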

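3. Who is the end user?
Now you need to enrich the persona and bring everything about it to life. Start by learning as much as you can about the user population, then generalize and summarize from it, and never lose touch with the requirements. What is the relationship between users and requirements? Chicken and egg? It is better understood as one of mutual validation and constraint. The textbooks suggest analyzing users with the following methods:

  • Imagination, drawing on personal knowledge and experience and on everyday observation and accumulated impressions
  • User interviews and field studies
  • Usability testing
  • User surveys
  • Website traffic / log file analysis

Reality may make this difficult: you may find that you have no means of doing any of these except the first, which is bad. No matter. Don't forget to mobilize your teammates and get them involved; this is a very valuable route. You must also learn how to draw on the experience of similar or competing products. Here are some suggestions:

  • Experience is a system. A mental model does not apply to just one service; it carries over across one or more domains.
  • Learn to be a user first: stay enthusiastic, try a great many services of every kind, and learn to analyze and summarize.
  • Spend more effort on the more successful products; they have already shaped the behavior patterns of many users.
  • Draw the circle around similar or competing products a little wider, and use and experience them as carefully as you can. Taking notes is important.
  • Share your findings and experience with your teammates.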

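4. Has the user led you into a ditch?
By now you may understand users better, or you may find that users are leading you into a ditch and your teammates are starting to complain. Could anything be worse? Of course: discovering only at the very end that you have been led into the ditch, or rather, led there by your own users. That is a huge setback. There may be two causes: you don't understand users, or you don't know how to make use of them. So until you have accumulated some data and have evidence to show, don't act rashly. Study and accumulate steadily, learn to introduce what you know to the team gradually, and build shared understanding and exchange across the whole team; this is also a very effective way to improve yourself. Remember one thing: knowing a little more does not make you an expert.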

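5. The relationship between end users and other users
As noted in point 1, apart from end users you have to deal directly with a sizeable group of other users, which means communicating and arguing your case from many different angles. Of course you will have plenty of ways to do that, a silver tongue for example. But think again about who the product is ultimately designed for. Why not introduce personas to strengthen the team's understanding and practice of user-centered design, and to establish an effective shared data standard for communication? A standard is, in practice, consensus reached.

Finally, don't create personas for the sake of creating personas, and remember that a persona is not a noun. Now you can quietly go and experiment with and practice user-centered design, through repeated failure, insight and accumulation.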


clock December 1, 2008 23:01 by author Sky Jia (贾超)
