Collect GUI Elements Web

GUI elements can be collected "automatically" or "manually".

Manual GUI Element Collection

GUI elements can be collected after each test step by setting Collect After Step in the parameter file.
The GUI elements of the starting page can be collected by creating a test with Sleep as its only step and Collect After Step selected. Collect After Step does not collect GUI elements in frames or iframes. These can be collected by using the action SwitchToIFrame together with Collect After Step.

A Spider run can be started after the last step of a test by checking Run Spider after last step in the Run Configuration. The Spider will start over from the Page URL given in the AUT configuration.

Automatic GUI Element Collection

Automatic GUI element collection is called Spider in MyITest4U.

The Spider treats a Web application as a tree. It starts by collecting all GUI elements (HTML tags) of the starting page. In the next step all clickables (links, buttons ...) are clicked. After each click all new GUI elements are collected. After clicking all clickables the Spider starts over. It continues until no more new clickables are found.
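
The traversal described above can be sketched as a queue-driven loop. This is only a minimal illustration, not the MyITest4U implementation: the class SpiderSketch, the map-based site model and the page names are hypothetical, and a "click" is simulated by looking up the page a clickable leads to.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical, simplified model of a Spider run; not MyITest4U code. */
final class SpiderSketch {

    /**
     * @param site      maps each page to the pages its clickables lead to (assumed model)
     * @param startPage the page the run starts from
     * @return all pages reached, in the order they were first collected
     */
    static Set<String> spider(Map<String, List<String>> site, String startPage) {
        Set<String> collected = new LinkedHashSet<>();
        Deque<String> toVisit = new ArrayDeque<>();

        collected.add(startPage);
        toVisit.add(startPage);

        // Go on until no page with unclicked clickables is left.
        while (!toVisit.isEmpty()) {
            String page = toVisit.remove();
            for (String target : site.getOrDefault(page, List.of())) {
                // "Click" the clickable; only pages not seen before are collected.
                if (collected.add(target)) {
                    toVisit.add(target);
                }
            }
        }
        return collected;
    }

    public static void main(String[] args) {
        Map<String, List<String>> site = Map.of(
                "home", List.of("products", "contact"),
                "products", List.of("home", "contact"),
                "contact", List.of("home"));
        System.out.println(spider(site, "home"));
    }
}
```

The set of already collected pages is what makes the loop terminate: a clickable that leads back to a known page adds nothing new to the queue.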

A Spider run using the default configuration can be the first step in the test creation using MyITest4U.
Such a Spider run will be successful, but it might take a lot of time depending on your Web page. The time needed for a Spider run might be shortened by following the guidelines given below. Further improvements might be achieved by making use of all the possibilities given under Spider Configuration.

Guidelines for a Spider Run

Very often a Web page contains GUI elements which appear under every URL of the Web page. An example of such a GUI element is the link to the contact form in the footer. The Spider will click this link on every page as the page URL is part of the GUI element identifier.

Excluding the footer from the Spider run will resolve this issue.

Excludes can also be used for the following situations:

Situation                                     Example
Logged in User                                Exclude Logout
Exclude GUI elements which are not present.   "Login" when a user is logged in.
                                              "Logout" when the user is not logged in.
Exclude URLs                                  "Help", "About" etc. might not change from one release to the next one.
"Hidden" GUI elements                         GUI elements on the back of a card or in a drop down text

The above situations can be handled by the definition of a Spider Configuration. The same AUT release can have more than one Spider Configuration e.g. for a "Logged In" and "Not Logged In" user.
"Hidden" GUI elements might be found by generating a test clicking all clickables. This can be done using the built-in Test Generator (Test Generation / Click All Clickable On Page).

It might be helpful to collect all the GUI elements of the starting page before starting a Spider run. The GUI elements found should be useful for creating your Spider Configuration.

Tests generated by the test generator Click All Clickable On Page might also be useful to collect the GUI elements of your web page as it is possible to collect all GUI elements after each click.

The Spider Configuration lets you choose whether frames are spidered or not. If the frames of your page contain only advertisements, it is probably safe not to spider frames, as you are unlikely to test the advertisements themselves.

There is no easy way to know whether two URLs call the same page. Therefore you should check whether the default Spider Configuration fits your page or needs to be changed.

MyITest4U allows the definition of a Search Strategy for GUI elements. You should check if the default Search Strategy fits your needs before you start a Spider run.

Spider Configuration

The Spider can be configured using the test generation editor.
The fine tuning of the Spider can be done using the Java class MyConfig.

There is a global and local configuration. The global configuration is used by all AUTs. The local configuration is used by a specific AUT release.
The Spider configuration is found in the test generation editor under the menu Config. A summary of all configuration entries is found in the Spider Configuration Summary dialog (Config / Spider / Summary). All local values can be set in this dialog.

Excludes

GUI elements defined under Excludes are not clicked in a Spider run. A GUI element is defined using one of the types given below and the corresponding value. In addition, you can define whether the value found in the GUI element should match exactly, start with, end with or contain the value given under Excludes. Use CTRL+Space to show entry proposals for Type and Behavior. An empty Value will remove the GUI element from the list.
The table below shows some example uses of Excludes.

Global / Local Excludes

Type:     Text
Value:    Sign In
Behavior: Exact
Comment:  All GUI elements having a text "Sign In" will not be clicked.
          Examples:
          <button type="submit" class="btn">Sign In</button>
          <a class="nav-link" href="./login.php">Sign In</a>

Type:     XPath
Value:    /html/body[1]/footer
Behavior: StartsWith
Comment:  All GUI elements having a XPath starting with "/html/body[1]/footer" will not be clicked.
          This can be used to exclude the footer of a web page.
          Example:
          /html/body[1]/footer[1]/div[1]/nav[1]/div[1]/ul[1]/li[1]
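
How an exclude entry of Type, Value and Behavior could be evaluated is sketched below. This is an illustration under assumptions, not the MyITest4U implementation: the class ExcludeSketch, the enum constants (including the names for the "end with" and "contain" behaviors) and the representation of a GUI element as a map from type name to value are all hypothetical.

```java
import java.util.List;
import java.util.Map;

/** Hypothetical sketch of exclude matching; names are illustrative, not MyITest4U API. */
final class ExcludeSketch {

    enum Behavior {
        EXACT, STARTS_WITH, ENDS_WITH, CONTAINS;

        /** Compares the value found in the GUI element with the value of the exclude. */
        boolean matches(String actual, String expected) {
            switch (this) {
                case EXACT:
                    return actual.equals(expected);
                case STARTS_WITH:
                    return actual.startsWith(expected);
                case ENDS_WITH:
                    return actual.endsWith(expected);
                default:
                    return actual.contains(expected);
            }
        }
    }

    /** One exclude entry: the type to look at, the value and how to compare. */
    static final class Exclude {
        final String type;
        final String value;
        final Behavior behavior;

        Exclude(String type, String value, Behavior behavior) {
            this.type = type;
            this.value = value;
            this.behavior = behavior;
        }

        boolean matches(Map<String, String> element) {
            String actual = element.get(type);
            // An element without a value for this type can never match.
            return actual != null && behavior.matches(actual, value);
        }
    }

    /** An element is skipped as soon as one exclude matches. */
    static boolean isExcluded(Map<String, String> element, List<Exclude> excludes) {
        for (Exclude exclude : excludes) {
            if (exclude.matches(element)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        List<Exclude> excludes = List.of(
                new Exclude("Text", "Sign In", Behavior.EXACT),
                new Exclude("XPath", "/html/body[1]/footer", Behavior.STARTS_WITH));

        Map<String, String> footerLink = Map.of(
                "Text", "Contact",
                "XPath", "/html/body[1]/footer[1]/div[1]/nav[1]/div[1]/ul[1]/li[1]");

        System.out.println(isExcluded(footerLink, excludes));
    }
}
```

In this sketch the footer link is excluded by the XPath entry although its text does not match the Text entry, because a single matching exclude is enough.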

A description of the exclude types is given below.

Exclude Types

Type    Description
Frame   The value of the "src" attribute of the frame / iframe tag.
Id      The value of the "id" attribute of any tag.
Name    The value of the "name" attribute of any tag.
Tag     The name of a tag.
Text    The text as obtained from the WebElement of the Selenium WebDriver.
Url     The URL of the page.
XPath   The XPath of any GUI element.
Value   The value of any "value" attribute of any tag.

Fine tuning of clickable exclusion can be done using the method doNotClick of the class MyConfig.

Spider Attributes and Types

The GUI elements that are clicked by the Spider are defined by tag attributes and tag types. The default attributes and types are given below.

Attributes:

  • href
  • onclick

Types:

  • button
  • submit
  • reset
  • image
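
A check based on the defaults above might look like the following sketch. It is not the MyITest4U implementation. It assumes that "types" refers to the value of the "type" attribute of a tag and that a tag is represented by a map of its attributes; the class ClickableSketch and its method are hypothetical.

```java
import java.util.Locale;
import java.util.Map;
import java.util.Set;

/** Hypothetical sketch of how clickables could be recognised; not MyITest4U code. */
final class ClickableSketch {

    // Default attributes and types as listed above.
    static final Set<String> CLICK_ATTRIBUTES = Set.of("href", "onclick");
    static final Set<String> CLICK_TYPES = Set.of("button", "submit", "reset", "image");

    /**
     * @param attributes the attributes of one HTML tag (attribute name to value)
     * @return true if the tag has a clickable attribute or a clickable type
     */
    static boolean isClickable(Map<String, String> attributes) {
        for (String name : CLICK_ATTRIBUTES) {
            if (attributes.containsKey(name)) {
                return true;
            }
        }
        // Assumption: the type is taken from the "type" attribute of the tag.
        String type = attributes.get("type");
        return type != null && CLICK_TYPES.contains(type.toLowerCase(Locale.ROOT));
    }

    public static void main(String[] args) {
        System.out.println(isClickable(Map.of("class", "nav-link", "href", "./login.php")));
        System.out.println(isClickable(Map.of("type", "text", "name", "user")));
    }
}
```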

Equal URLs

There is no easy way to know whether two URLs call the same page. For example, http://localhost:80 and http://localhost:80/ may call the same page, but according to the URL specification the two URLs are different and could therefore call two different pages. You can tell MyITest4U which URLs are to be treated as equal by using the method prepareUrlForDb(String url) of the class MyConfig. The example below shows the default used by MyITest4U. It treats URLs with and without a trailing "/" as equal. URLs linking to different parts of the same page are treated as equal too: the "#" and everything behind it are removed from the URL before the comparison is done.

       
    /**
     * Add rules for equal URLs
     * There is no easy way to know if two URL are the same or not.
     * e.g. http://localhost:80 == http://localhost:80/ 
     * The two URL can be equal but following the URL specification they are
     * different.
     * 
     * @param url The url found by Selenium method driver.getCurrentUrl()
     * @return urlForDb This url is inserted into myitest4u_db and used for URL
     *         comparison
     */
     
    @Override
    public String prepareUrlForDb(String url) {

        if (url.indexOf("#") != -1) { 
            url = url.substring(0, url.indexOf("#")); 
        }

        if (url.endsWith("/")) { 
            url = url.substring(0, url.length() - 1);
        }

        return url;
    }
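
The effect of this default rule can be tried out in isolation. The sketch below copies the logic of the method above into a standalone class; the class name EqualUrlDemo and the sample URLs are only illustrative.

```java
/** Standalone copy of the default rule shown above, for experimenting only. */
final class EqualUrlDemo {

    static String prepareUrlForDb(String url) {
        // Remove "#" and everything behind it.
        int hash = url.indexOf('#');
        if (hash != -1) {
            url = url.substring(0, hash);
        }
        // Remove one trailing "/".
        if (url.endsWith("/")) {
            url = url.substring(0, url.length() - 1);
        }
        return url;
    }

    public static void main(String[] args) {
        String a = prepareUrlForDb("http://localhost:80");
        String b = prepareUrlForDb("http://localhost:80/");
        String c = prepareUrlForDb("http://localhost:80/#contact");

        // All three URLs are stored as the same string and therefore compare equal.
        System.out.println(a.equals(b) && b.equals(c));
    }
}
```

Note that the fragment is removed first, so a URL such as http://localhost:80/#contact also loses the "/" that precedes the "#".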
    
      

How are GUI elements found?

GUI elements are found using search criteria provided by Selenium e.g. id, name, css selector ...
You can define a list of search criteria to be used. MyITest4U goes through this list from start to end and uses the first search criterion that is possible for the GUI element. The last search criterion in the list must always be set. In the default configuration MyITest4U uses XPath as the last search criterion and the following search criteria list:

  • ById
  • ByName
  • ByLinkText
  • ByCssSelectorHref
  • ByXPathText
  • ByXPath

The search criteria ById, ByName, ByLinkText and ByXPath are using the standard selectors provided by Selenium. ByCssSelectorHref uses a CssSelector to get the value of the "href" attribute.
ByXPathText uses the value of the "text" attribute of the GUI element.

You can change the list and/or order of search criteria to be used by changing the method addSearchCriteria() of the class MyConfig.

       
    /**
     * Used to define the list and order of the search criteria to be used.
     * The last search criteria has to be always set. If it is not set no search for
     * the WebElement can be done.
     * ByXPath is always set.
     * So make sure it is the last search criteria in the list.
     *
     * @param htmlTag can be used to define different search criteria dependent on
     *                tagName etc.
     * The example shows the use of XPath for all th, td tags. The
     * usual order of search criteria is used for all other tags.
     */
    @Override
    public void addSearchCriteria(HtmlTag htmlTag) {

        getSearchCriterias().clear();

        // if (htmlTag != null && htmlTag.getTagName() != null &&
        // (htmlTag.getTagName().equals("td")
        // || htmlTag.getTagName().equals("th"))) {
        // getSearchCriterias().add(new ByXPath());
        // } else {
        getSearchCriterias().add(new ById());
        getSearchCriterias().add(new ByName());
        getSearchCriterias().add(new ByLinkText());
        getSearchCriterias().add(new ByCssSelectorHref());
        getSearchCriterias().add(new ByXPathText());
        getSearchCriterias().add(new ByXPath());
        // }
    }

It is also possible to define a different search strategy. Each search strategy is identified by its name. The search strategy name can be set in the test generation editor in the Run Test Configuration or Spider Configuration dialog by setting a Run configuration name and / or a Spider configuration name. The configuration name can be obtained using the method getRunConfigName() or getSpiderConfigName() of the MyConfig class. The Spider search strategy can be different from the search strategy used to run tests. The search strategy can be set in the method getWebElement(Clickable clickable, Clickable iFrameClickable, WebDriver driver) of the class MyConfig. In the example below the Spider would use the XPath as the only search criterion.

       
    /**
     * Insert rules of how a WebElement has to be found. Do not catch the
     * NoSuchElementException -> the main program decides what to do if a
     * NoSuchElementException is thrown.
     *
     * Use getRunConfigName() or getSpiderConfigName() to separate different search strategies.
     * 
     * @param clickable
     * @param iFrameClickable
     * @param driver
     * 
     * @return the webElement to be clicked
     * @throws NoSuchElementException
     **/
    @Override
    public WebElement getWebElement(Clickable clickable, Clickable iFrameClickable, WebDriver driver)
            throws NoSuchElementException, WebDriverException {

        if (getSpiderConfigName().equals("nameOfSpiderConfig")) {
            if (clickable == null) {
                return null;
            }

            setSearchCriteria("By.xpath: " + clickable.getXPath()); //$NON-NLS-1$

            return driver.findElement(By.xpath(clickable.getXPath()));
        }

        return super.getWebElement(clickable, iFrameClickable, driver);
    }

      

The default search strategy is shown in the method below. It uses the search strategy defined in addSearchCriteria().

       
    public WebElement getWebElement(Clickable clickable, Clickable iFrameClickable, WebDriver driver) {

        if (clickable == null) {
            return null;
        }

        frameHandler(iFrameClickable, clickable.getIFrameId(), driver);

        addSearchCriteria();

        for (ISearchCriteria searchCriteria : searchCriterias) {

            searchCriteria.setClickable(clickable);

            if (searchCriteria.isSet()) {
                setSearchCriteria(searchCriteria.getUsedToSearch());
                usedToSearch.add(searchCriteria.getUsedToSearch());

                return searchCriteria.find(driver);
            }
        }

        return null;
    }

      

Equal HTML Tags

MyITest4U defines an identifier for HTML tags. The Spider uses this identifier to decide if two HTML tags are equal or not. The HTML tag identifier is not used for anything else. It is not used to find GUI elements.

The HTML tag identifier contains the URL of the page on which the tag is found and a configurable string. In the default configuration MyITest4U uses the XPath of the GUI element as the second part of the HTML tag identifier. You can change this behavior by changing the method createIdentifier(HtmlTag htmlTag).
Depending on your AUT, it may make sense to use different identifiers for different GUI elements, e.g. the XPath for all table tags and the optimal search criterion for all other GUI elements. Doing so makes sure that all table cells are stored in the database independent of the cell value and the td tag attributes. The choice of identifier and its effect on the number of GUI elements stored is shown in the table below.

Identifier:          XPath
HTML:
    <tr>
    <td></td>
    <td></td>
    </tr>
    <tr>
    <td>hansli</td>
    <td>hansli</td>
    </tr>
Stored GUI elements: 2 td for the empty cell.
                     2 td for hansli cell.
How to find:         ByXPathText()
                     ByXPath
Remarks:             Index has to be used to find the second td.
                     The XPath for all cells is stored in the database and can be used.

Identifier:          Search criteria
HTML:
    <tr>
    <td></td>
    <td></td>
    </tr>
    <tr>
    <td>hansli</td>
    <td>hansli</td>
    </tr>
Stored GUI elements: 1 td for the empty cell
                     1 td for hansli cell
How to find:         ByXPathText()
                     ByXPath
Remarks:             Index has to be used to find the second td.
                     The XPath can not be used to find the second td as it is not stored in the database.

Identifier:          XPath
HTML:                Error messages are shown above an element.
Stored GUI elements: The element gets two different XPath.
                     Two test steps are generated by test generators
Remarks:             The same search criteria will find the element with or without an error message except XPath.

Identifier:          Search criteria
HTML:                Error messages are shown above an element.
Stored GUI elements: The element is stored once if the search criteria is not XPath.
Remarks:             No complications if the search criteria for the element is not XPath.

The choice of search criteria can also be important in a test run.
Suppose all table cells have a unique id and you want to check the text in each table cell. In this case using the id as search criterion works fine.
Checking the sorting of the table, however, will only work if the id is not moved with the cell. So it might be better to use the XPath as search criterion, as the XPath will always point to the same cell position.

Changing the identifier definition for an AUT release might lead to complications and unexpected results.
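
The idea of combining the page URL with a configurable second part can be sketched as follows. This is not the MyITest4U implementation of createIdentifier(HtmlTag htmlTag): the class IdentifierSketch, the plain string parameters used instead of HtmlTag, the "|" separator and the "id=" / "xpath=" prefixes are assumptions made for illustration. The sketch uses the XPath for table tags and the id, where available, for all other tags.

```java
import java.util.Set;

/** Hypothetical sketch of an HTML tag identifier; not MyITest4U code. */
final class IdentifierSketch {

    static final Set<String> TABLE_TAGS = Set.of("table", "tr", "th", "td");

    /**
     * @param pageUrl the URL of the page on which the tag is found
     * @param tagName the tag name, e.g. "td" or "a"
     * @param id      the value of the id attribute, may be null
     * @param xPath   the XPath of the tag
     * @return page URL plus the configurable second part
     */
    static String createIdentifier(String pageUrl, String tagName, String id, String xPath) {
        boolean isTableTag = tagName != null && TABLE_TAGS.contains(tagName);
        boolean hasId = id != null && !id.isEmpty();

        // Table tags always use the XPath, so every cell gets its own identifier.
        String secondPart = (!isTableTag && hasId) ? "id=" + id : "xpath=" + xPath;

        return pageUrl + "|" + secondPart;
    }

    public static void main(String[] args) {
        String xPath = "/html/body[1]/footer[1]/a[1]";
        // The same footer link on two pages yields two different identifiers.
        System.out.println(createIdentifier("http://localhost/a", "a", "contact", xPath));
        System.out.println(createIdentifier("http://localhost/b", "a", "contact", xPath));
    }
}
```

Because the page URL is part of the identifier, a GUI element that appears on every page is treated as a new element on each of them.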

Which URLs are spidered

Your application might access URLs outside your domain or it might contain URLs which you do not want to be spidered. The method collectFromUrl(WebDriver driver, String rootUrl) of the class MyConfig allows you to configure which URLs are spidered.

       
    /**
     * Use this method to set URLs which should be spidered or not.
     * 
     * @param driver
     * @param rootUrl the root URL of the AUT.
     * @return true if the URL should be spidered otherwise false.
     */
    @Override
    public boolean collectFromUrl(WebDriver driver, String rootUrl) {

        ArrayList<String> validUrls = new ArrayList<>();
        validUrls.add(rootUrl);

        for (String url : validUrls) {
            if (driver.getCurrentUrl().startsWith(url)) {
                return true;
            }
        }

        return false;
    }
       
       
      

Exclude Clickables from Spider Run

As a last resort to exclude clickables from a Spider run, you can use the method doNotClick(Clickable clickable, WebDriver driver, WebElement element) of the class MyConfig. In the default configuration the method doNotClick just returns false.
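
The kind of rule that might be placed in doNotClick is sketched below. The real method receives a Clickable, a WebDriver and a WebElement. To keep the sketch self-contained it takes the element text and the current URL as plain strings instead, and the rule itself (skip "delete" elements on pages under /admin/) is purely hypothetical.

```java
import java.util.Locale;

/** Hypothetical stand-in for a doNotClick rule; not the MyITest4U method signature. */
final class DoNotClickSketch {

    /**
     * @param elementText the visible text of the clickable, may be null
     * @param currentUrl  the URL of the current page, may be null
     * @return true if the clickable must not be clicked
     */
    static boolean doNotClick(String elementText, String currentUrl) {
        if (elementText == null || currentUrl == null) {
            return false; // same result as the default: click everything
        }
        // A combination of two conditions: page location and element text.
        boolean onAdminPage = currentUrl.contains("/admin/");
        boolean destructive = elementText.toLowerCase(Locale.ROOT).contains("delete");

        return onAdminPage && destructive;
    }

    public static void main(String[] args) {
        System.out.println(doNotClick("Delete user", "http://localhost/admin/users"));
        System.out.println(doNotClick("Delete user", "http://localhost/help"));
    }
}
```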

Manipulate JSoup Document

The Spider uses JSoup to analyze the HTML source. In some cases the page source obtained by Selenium is not the same as seen by Selenium when searching for WebElements. The method manipulateHtml(org.jsoup.nodes.Document doc) of the class MyConfig can be used to manipulate the JSoup Document. A simple example of how this can be done is given below.

       
    /**
     * In some cases the pageSource obtained by Selenium is not the same as seen by
     * Selenium when searching for WebElements. Use this method to manipulate the
     * JSoup Document before collecting all the WebElements. See commented part for
     * an example.
     * 
     * @param doc
     * @return the manipulated doc.
     */

    @Override
    public org.jsoup.nodes.Document manipulateHtml(org.jsoup.nodes.Document doc) {
        Elements selector = doc.select("head");
       
        for (Element element : selector) {
            element.remove();
        }
        
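        // Remove the first div; first() returns null if the document has none.
        Element firstDiv = doc.select("div").first();
        if (firstDiv != null) {
            firstDiv.remove();
        }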

        return doc;
    }
       
      

Run Spider

The Spider can be started from the test generation editor using the menu Run / Run Spider. No test or test block can be present for the Spider to run. The Run Spider dialog allows you to configure certain specific parameters. You can save the current settings by clicking Save. This will generate a batch file (RunSPIDER.bat) which can be used to run the Spider. The saved settings can be loaded by clicking on Load.

A screenshot is taken after each click if Take screen shots is checked.
The HTML source of the current page is saved if Save HTML source is checked.
The screenshots and the sources of the HTML pages are saved in the folder ../MyITest4U/Analysis/<AutRelease>/SpiderRun. The files are numbered in the order they are obtained.
No frames are spidered if Do not spider frames is checked.
The log level can be set in the advanced tab in the Run Spider dialog. The level TestLog will show a minimal log. The level Debug will show a lot more information.
Java remote debugging can be used if Debug is checked. The following VM arguments are set:

  • -Xdebug
  • -Xrunjdwp:transport=dt_socket,server=y,address="8000"

The Spider run will do all universal checks like checking for strings which should be found on a page or not. The errors found are written to SpiderFoundBugs.log (..\MyITest4U\TestLog\WebSelenium\SpiderFoundBugs.log).