In this paper, we discuss Etor, an ontology-based tool that uses SWRL rules to automatically identify unethical activity in GitHub Open Source Software (OSS) projects. Both repository and issue/pull request analysis are supported by Etor's architecture, which integrates with the GitHub API and uses extra elements like a license detector and a code similarity checker (AC2) to find infractions.

Soft Forks, Silent License Changes, and Self-Promo: Etor Sees It All

Abstract and 1. Introduction

  2. Background and Related Work

  3. Study of Unethical Behavior in OSS

    3.1 RQ1: Types of unethical behavior

    3.2 RQ2: Affected software artifacts

  4. Methodology

    4.1 Modeling via SWRL rules

    4.2 Automatic detection of unethical behavior

  5. Evaluation

  6. Discussion and Implications

  7. Threats to Validity

  8. Conclusion and References

4.2 Automatic detection of unethical behavior

We designed Etor to automatically detect six types of unethical behavior. We excluded nine types because (1) they involve artifacts (e.g., product names, software features) that are difficult to automatically isolate from other artifacts (i.e., “No opt-in or no option allowed”, “Privacy Violation”, “Naming confusion”, and “Offensive language”), (2) they require sophisticated analysis of configuration files, APIs, or source code (i.e., “Plagiarism”, “Depending on proprietary software”, and “Vulnerable code/API”), (3) their detection requires advanced natural language processing (i.e., “Closing issue/PR without explanation”, as it requires automatically checking whether an explanation for closing the PR/issue exists), and (4) approaches for “License incompatibility” [52, 60, 82] already exist, so we exclude it to avoid reinventing the wheel.

Figure 3: Our ontology of unethical behavior in OSS projects

Figure 4: Overall architecture of Etor (GH denotes GitHub).

Overview of Etor. Figure 4 presents the overall architecture of our automatic detection tool, Etor. Etor supports detection of unethical behavior at two levels: (1) repository (denoted as repo), and (2) GitHub issue/pull request (we denote an issue as issue and a pull request as PR). Given a repo or an issue/PR, and the type of unethical behavior eType to be checked, Etor relies on its set of SWRL rules for detection and outputs whether the given input violates eType. Apart from the GitHub attributes in Table 2, which can be obtained through the GitHub API, our SWRL rule reasoner uses two additional components: (1) a license detector that checks for licenses at the repository level, and (2) a code similarity checker that identifies similar code.

Supported types. Etor supports six types of unethical behavior. We include the SWRL rules for all supported types in the supplementary material. We next describe how Etor checks each supported type.

(S1) No attribution to the author in code. Etor checks whether an issue or a PR has a Stack Overflow link that points to reference code, and whether the code snippet copied from Stack Overflow cites that link. Although stakeholders can copy reference code from many resources, Etor only checks for Stack Overflow links because (1) we learned from our study and from existing work [42] that contributors are required to give credit for code snippets copied from Stack Overflow, as they are protected by the CC-BY-SA Creative Commons license, and (2) supporting other online resources (e.g., GitHub links) would require automatically extracting the original reference code (which requires parsing Web pages of different formats) and identifying the appropriate license for the code snippet (which requires detecting the license for partial code and is beyond the scope of this paper). Given an issue/PR, Etor checks whether a comment b posted in the issue/PR by a stakeholder u1 contains a Stack Overflow link w (we use a regular expression to extract w). Etor reports a potential violation if: (1) u1 is not the owner of the Stack Overflow comment, (2) the code snippet from Stack Overflow is found in one of the files in the repository (F) with at least 10% similarity (copyright law permits the use of up to 10% of work without permission [20]), and (3) w is not found in F.
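The three S1 conditions can be sketched as a small predicate. This is only an illustration under stated assumptions: the regular expression, the `S1Evidence` record, and the function names are hypothetical (the text does not give Etor's actual pattern or data model), and the similarity score is assumed to be computed elsewhere by the code similarity checker.

```python
import re
from dataclasses import dataclass

# Assumed pattern for Stack Overflow question/answer links; the regular
# expression Etor actually uses is not given in the text.
SO_LINK = re.compile(r"https?://stackoverflow\.com/(?:questions|q|a)/\d+[^\s)\]>]*")


def extract_so_links(comment_body: str) -> list[str]:
    """Return every Stack Overflow link (w) found in a comment body (b)."""
    return SO_LINK.findall(comment_body)


@dataclass
class S1Evidence:
    """Hypothetical record of the facts the S1 rule needs."""
    commenter_owns_so_post: bool  # is u1 the owner of the Stack Overflow content?
    similarity: float             # snippet-vs-file similarity in [0, 1], computed elsewhere
    file_text: str                # contents of the repository file F
    link: str                     # the Stack Overflow link w


def s1_potential_violation(e: S1Evidence, threshold: float = 0.10) -> bool:
    """True when all three S1 conditions from the text hold."""
    return (
        not e.commenter_owns_so_post      # (1) u1 is not the owner
        and e.similarity >= threshold     # (2) at least 10% similarity with F
        and e.link not in e.file_text     # (3) w is not cited in F
    )
```

Keeping the threshold as a parameter makes the 10% cut-off easy to vary when studying its effect on false positives.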

(S2) Soft forking. Given two repositories r1 and r2, Etor compares the contents of all source files in the two repositories to check whether one repository is a soft fork of the other (i.e., it has the same content but is not listed as an official fork). Specifically, we use AC2 [21] to detect the similarities between files. AC2 is a source code plagiarism detection tool that has been widely used by graders to detect plagiarism within a group of assignments. We select AC2 because (1) it supports many programming languages (e.g., C, C++, Java, and PHP), (2) it can be run in a local environment without connection to remote servers, and (3) it is quite robust as it incorporates multiple algorithms found in the literature. Etor reports a violation if: (1) it detects 100% similarity between r1 and r2, and (2) r2 is not in the fork list of r1.

(S5) No license provided in public repository. Given a repository r, Etor detects the repo-level license by checking whether it exists in: (1) the LICENSE file [22] in the main directory of r (we check only the main directory to avoid mistakenly finding an API license or package license), or (2) a README.md file with license information (we use the list of licenses provided by GitHub [23] for repo-level license detection). Etor reports a potential violation if no license is found after searching the two files.
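The S2 and S5 decisions reduce to two short predicates, sketched below. The function names, the dictionary representation of the main directory, and the three license markers are assumptions made for illustration: the text says Etor uses GitHub's list of licenses and AC2 for similarity, neither of which is reproduced here.

```python
# Illustrative subset only; the text says Etor relies on the list of
# licenses provided by GitHub, which is not reproduced here.
KNOWN_LICENSE_MARKERS = (
    "MIT License",
    "Apache License",
    "GNU General Public License",
)


def is_soft_fork(similarity: float, r2: str, forks_of_r1: set[str]) -> bool:
    """S2: identical content (similarity computed elsewhere, e.g. by AC2)
    and r2 is not listed among the official forks of r1."""
    return similarity >= 1.0 and r2 not in forks_of_r1


def has_repo_level_license(root_files: dict[str, str]) -> bool:
    """S5: look for a license in the main directory only.

    root_files is a hypothetical mapping from file names in the main
    directory of r to their text, so licenses in subdirectories
    (API or package licenses) are never seen.
    """
    if "LICENSE" in root_files:                     # (1) LICENSE file
        return True
    readme = root_files.get("README.md", "").lower()
    return any(m.lower() in readme for m in KNOWN_LICENSE_MARKERS)  # (2) README.md


def s5_potential_violation(root_files: dict[str, str]) -> bool:
    return not has_repo_level_license(root_files)
```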

(S6) Uninformed license change. We consider a change to be uninformed if (1) it is not announced in the CHANGELOG.md or (2) the license change is not done via PR. Given a repository r, Etor checks whether the repo-level license has been changed by: (1) extracting the commit list of the license file, and (2) checking whether the commit changes include license updates. If license changes occur in more than one commit (we ignore the first commit as it is the initial license creation), Etor checks whether the changes have been announced by checking whether the CHANGELOG.md mentions license information. If license information is not found, Etor checks the PR count for the commit (pullRequestCountByCommit). If the count is less than one, Etor marks the commit as a potential violation.
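The S6 procedure is a short decision chain, sketched here under assumptions: the `LicenseCommit` record and the function name are hypothetical, the commit list is assumed to be ordered oldest first, and a plain keyword search stands in for the unspecified check that CHANGELOG.md "mentions license information".

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class LicenseCommit:
    """Hypothetical record for one commit that touches the license file."""
    sha: str
    changes_license: bool  # does the commit update the license?
    pr_count: int          # PRs associated with the commit (pullRequestCountByCommit)


def uninformed_license_changes(
    commits: list[LicenseCommit], changelog: Optional[str]
) -> list[str]:
    """Return the SHAs flagged as potential S6 violations.

    commits must be ordered oldest first; the first one is skipped because
    it is the initial license creation.
    """
    later_changes = [c for c in commits[1:] if c.changes_license]
    if not later_changes:
        return []  # the license was never changed after creation
    # Assumed heuristic: the changelog announces the change if it mentions "license".
    if changelog is not None and "license" in changelog.lower():
        return []
    # Not announced: flag commits that did not go through a PR.
    return [c.sha for c in later_changes if c.pr_count < 1]
```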

(S8) Self-promotion. We consider self-promotion to be the scenario where a contributor u opens a GitHub issue/PR whose content includes links to another GitHub repository in order to promote his or her own repository. Given an issue/PR for r1 as input, Etor first (1) checks that the issue/PR includes a link L to another repository r2, and (2) identifies the stakeholder u who opened the issue/PR. Then, it reports a violation if: (1) r1 is not r2, (2) u is not a contributor of r1 (i.e., u is an outsider for r1), and (3) u is a contributor of r2. To reduce false positives, Etor also checks whether L includes specific keywords which usually indicate that the contributor is sharing L for demonstration purposes (e.g., [DEMO]) rather than promoting a repository/library: “/issues/”, “/pull/”, “/commit/”, “/tree/”, “/releases/”, “/blob/”, and “/runs/”.
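A rough sketch of the S8 rule follows. It assumes the listed keywords are URL path segments (written with forward slashes, as in GitHub URLs); the regular expression, the `contributors` mapping, and the function names are hypothetical and not taken from Etor.

```python
import re

# Assumed pattern for links to GitHub repositories; not Etor's actual regex.
REPO_LINK = re.compile(r"https?://github\.com/([\w.-]+)/([\w.-]+)(/[^\s)]*)?")

# Path segments that suggest the link is shared for demonstration purposes.
DEMO_SEGMENTS = ("/issues/", "/pull/", "/commit/", "/tree/", "/releases/", "/blob/", "/runs/")


def linked_repos(body: str) -> list[tuple[str, str]]:
    """Return (owner/name, full link L) for each GitHub repository link in body."""
    return [(f"{m.group(1)}/{m.group(2)}", m.group(0)) for m in REPO_LINK.finditer(body)]


def is_self_promotion(
    body: str, r1: str, opener: str, contributors: dict[str, set[str]]
) -> bool:
    """S8: an outsider of r1 links to a different repository r2 they contribute to."""
    for r2, link in linked_repos(body):
        if any(segment in link for segment in DEMO_SEGMENTS):
            continue  # likely a demonstration link, skipped to reduce false positives
        if (
            r2 != r1                                        # (1) r1 is not r2
            and opener not in contributors.get(r1, set())   # (2) u is an outsider for r1
            and opener in contributors.get(r2, set())       # (3) u contributes to r2
        ):
            return True
    return False
```

In practice the contributor sets would come from the GitHub API; they are passed in here so the rule itself stays easy to test.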

Table 3: Number of issues detected and TP/FP rate

(S9) Unmaintained Android Project with Paid Service. This type checks whether an Android project offers a paid service in Google Play but is no longer actively maintained in its GitHub repository. On average, 115 APIs are updated per month [65], and 49% of app updates have at least one update within 47 days [67]. Based on this frequency of app updates, we define an unmaintained Android project to be an Android project whose latest update was released more than 0.5 year ago. Given a repository r as input, Etor first checks for unmaintained Android projects by examining whether (1) the latest release date (D) of r is more than 0.5 year in the past, and (2) r is an original repository (not forked from other repositories). Then, it checks whether the app offers a paid service by (1) identifying the Google Play link l from r, and (2) searching for the phrase “in-app purchase”.
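The S9 check can be illustrated with a small date comparison. The sketch reads the 0.5-year threshold as "the latest release is more than half a year old", which is an interpretation; it also assumes 182 days for half a year and that the text of the Google Play page has already been fetched. All names are hypothetical.

```python
from datetime import date, timedelta
from typing import Optional

HALF_YEAR = timedelta(days=182)  # assumed day count for "0.5 year"


def is_unmaintained(latest_release: date, is_fork: bool, today: date) -> bool:
    """An original (non-forked) repository whose latest release (D) is older than half a year."""
    return (not is_fork) and (today - latest_release) > HALF_YEAR


def offers_paid_service(play_page_text: Optional[str]) -> bool:
    """True if a Google Play page was found and it mentions in-app purchases."""
    return play_page_text is not None and "in-app purchase" in play_page_text.lower()


def s9_potential_violation(
    latest_release: date, is_fork: bool, play_page_text: Optional[str], today: date
) -> bool:
    return is_unmaintained(latest_release, is_fork, today) and offers_paid_service(play_page_text)
```

Passing `today` explicitly keeps the check deterministic, which helps when re-running the detection on archived data.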


:::info Authors:

(1) Hsu Myat Win, Southern University of Science and Technology, China (11960003@mail.sustech.edu.cn);

(2) Haibo Wang, Southern University of Science and Technology, China (wanghb2020@mail.sustech.edu.cn);

(3) Shin Hwei Tan, a corresponding author from Southern University of Science and Technology, China (tansh3@sustech.edu.cn).

:::


:::info This paper is available on arXiv under the CC BY 4.0 DEED license.

:::
